Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes

ABSTRACT

Methods for amplification of nucleic acids in cells are provided. Also provided are nucleic acid molecules for amplification.

RELATED APPLICATIONS

[0001] This application is a divisional of copending U.S. application Ser. No. 09/724,693, filed Nov. 28, 2000, to GYULA HADLACZKY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES.

[0002] This application is a continuation of copending U.S. application Ser. No. 08/835,682, now abandoned, filed Apr. 10, 1997, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application is also a continuation-in-part of U.S. application Ser. No. 08/695,191, filed Aug. 7, 1996, now U.S. Pat. No. 6,025,155, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application is also a continuation-in-part of U.S. application Ser. No. 08/682,080, filed Jul. 15, 1996, now U.S. Pat. No. 6,077,697, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, and is also a continuation-in-part of copending U.S. application Ser. No. 08/629,822, now abandoned, filed Apr. 10, 1996 to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES.

[0003] U.S. application Ser. No. 08/835,682 is a continuation-in-part of U.S. application Ser. No. 08/695,191. U.S. application Ser. No. 08/695,191 is a continuation-in-part of U.S. application Ser. No. 08/682,080 and also is a continuation-in-part of U.S. application Ser. No. 08/629,822. U.S. application Ser. No. 08/682,080 is a continuation-in-part of U.S. application Ser. No. 08/629,822.

[0004] This application is related to U.S. application Ser. No. 07/759,558, now U.S. Pat. No. 5,288,625, is related to U.S. application Ser. No. 08/734,344, filed Oct. 21, 1996, and is related to U.S. application Ser. No. 08/375,271, filed Jan. 19, 1995, now U.S. Pat. No. 5,712,134. U.S. application Ser. No. 08/375,271 is a continuation of U.S. application Ser. No. 08/080,097, filed Jun. 23, 1993, which is a continuation of U.S. application Ser. No. 07/892,487, filed Jun. 3, 1992, which is a continuation of U.S. application Ser. No. 07/521,073, filed May 9, 1990.

[0005] The subject matter of each of the above-noted U.S. applications and patents is incorporated in its entirety by reference thereto.

FIELD OF THE INVENTION

[0006] The present invention relates to methods for preparing cell lines that contain artificial chromosomes, methods for isolation of the artificial chromosomes, targeted insertion of heterologous DNA into the chromosomes, delivery of the chromosomes to selected cells and tissues and methods for isolation and large-scale production of the chromosomes. Also provided are cell lines for use in the methods, and cell lines and chromosomes produced by the methods. Further provided are cell-based methods for production of heterologous proteins, gene therapy methods and methods of generating transgenic animals, particularly non-human transgenic animals, that use artificial chromosomes.

BACKGROUND OF THE INVENTION

[0007] Several viral vector, non-viral, and physical delivery systems for gene therapy and recombinant expression of heterologous nucleic acids have been developed [see, e.g., Mitani et al. (1993) Trends Biotech. 11:162-166]. The presently available systems, however, have numerous limitations, particularly where persistent, stable, or controlled gene expression is required. These limitations include: (1) size limitations because there is a limit, generally on the order of about ten kilobases [kb], at most, to the size of the DNA insert [gene] that can be accepted by viral vectors, whereas a number of mammalian genes of possible therapeutic importance are well above this limit, especially if all control elements are included; (2) the inability to specifically target integration so that random integration occurs, which carries a risk of disrupting vital genes or cancer suppressor genes; (3) the expression of randomly integrated therapeutic genes may be affected by the functional compartmentalization in the nucleus and by chromatin-based position effects; and (4) the copy number and consequently the expression of a given gene to be integrated into the genome cannot be controlled. Thus, improvements in gene delivery and stable expression systems are needed [see, e.g., Mulligan (1993) Science 260:926-932].

[0008] In addition, safe and effective vectors and gene therapy methods should have numerous features that are not assured by the presently available systems. For example, a safe vector should not contain DNA elements that can promote unwanted changes by recombination or mutation in the host genetic material, should not have the potential to initiate deleterious effects in cells, tissues, or organisms carrying the vector, and should not interfere with genomic functions. In addition, it would be advantageous for the vector to be non-integrative, or designed for site-specific integration. Also, the copy number of therapeutic gene(s) carried by the vector should be controlled and stable; the vector should secure the independent and controlled function of the introduced gene(s); and the vector should accept large (up to Mb size) inserts and ensure the functional stability of the insert.

[0009] The limitations of existing gene delivery technologies, however, argue for the development of alternative vector systems suitable for transferring large [up to Mb size or larger] genes and gene complexes together with regulatory elements that will provide a safe, controlled, and persistent expression of the therapeutic genetic material.

[0010] At the present time, none of the available vectors fulfill all these requirements. Most of these characteristics, however, are possessed by chromosomes. Thus, an artificial chromosome would be an ideal vector for gene therapy, as well as for stable, high-level, controlled production of gene products that require coordination of expression of numerous genes or that are encoded by large genes, and other uses. Artificial chromosomes for expression of heterologous genes in yeast are available, but construction of defined mammalian artificial chromosomes has not been achieved. Such construction has been hindered by the lack of an isolated, functional, mammalian centromere and uncertainty regarding the requisites for its production and stable replication. Unlike in yeast, there are no selectable genes in close proximity to a mammalian centromere, and the presence of long runs of highly repetitive pericentric heterochromatic DNA makes the isolation of a mammalian centromere using presently available methods, such as chromosome walking, virtually impossible. Other strategies are required for production of mammalian artificial chromosomes, and some have been developed. For example, U.S. Pat. No. 5,288,625 provides a cell line that contains an artificial chromosome, a minichromosome, that is about 20 to 30 megabases. Methods provided for isolation of these chromosomes, however, provide preparations of only about 10-20% purity. Thus, development of alternative artificial chromosomes and perfection of isolation and purification methods as well as development of more versatile chromosomes and further characterization of the minichromosomes is required to realize the potential of this technology.

[0011] Therefore, it is an object herein to provide mammalian artificial chromosomes and methods for introduction of foreign DNA into such chromosomes. It is also an object herein to provide methods of isolation and purification of the chromosomes. It is also an object herein to provide methods for introduction of the mammalian artificial chromosome into selected cells, and to provide the resulting cells, as well as transgenic non-human animals, birds, fish and plants that contain the artificial chromosomes. It is also an object herein to provide methods for gene therapy and expression of gene products using artificial chromosomes. It is a further object herein to provide methods for constructing species-specific artificial chromosomes de novo. Another object herein is to provide methods to generate de novo mammalian artificial chromosomes.

SUMMARY OF THE INVENTION

[0012] Mammalian artificial chromosomes [MACs] are provided. Also provided are artificial chromosomes for other higher eukaryotic species, such as insects, birds, fowl and fish, produced using the MACs and methods provided herein. Methods for generating and isolating such chromosomes are provided. Methods using the MACs to construct artificial chromosomes from other species, such as insect, bird, fowl and fish species are also provided. The artificial chromosomes are fully functional stable chromosomes. Two types of artificial chromosomes are provided. One type, herein referred to as SATACs [satellite artificial chromosomes or satellite DNA based artificial chromosomes (the terms are used interchangeably herein)] are stable heterochromatic chromosomes, and the other type are minichromosomes based on amplification of euchromatin.

[0013] Artificial chromosomes provide an extra-genomic locus for targeted integration of megabase [Mb] pair size DNA fragments that contain single or multiple genes, including multiple copies of a single gene operatively linked to one promoter or each copy or several copies linked to separate promoters. Thus, methods using the MACs to introduce the genes into cells, tissues, and animals, as well as species such as birds, fowl, fish and plants, are also provided. The artificial chromosomes with integrated heterologous DNA may be used in methods of gene therapy, in methods of production of gene products, particularly products that require expression of multigenic biosynthetic pathways, and also are intended for delivery into the nuclei of germline cells, such as embryo-derived stem cells [ES cells], for production of transgenic (non-human) animals, birds, fowl and fish. Transgenic plants, including monocots and dicots, are also contemplated herein.

[0014] Mammalian artificial chromosomes provide extra-genomic specific integration sites for introduction of genes encoding proteins of interest and permit megabase size DNA integration so that, for example, genes encoding an entire metabolic pathway or a very large gene, such as the cystic fibrosis [CF; ~250 kb] genomic DNA gene, or several genes, such as multiple genes encoding a series of antigens for preparation of a multivalent vaccine, can be stably introduced into a cell. Vectors for targeted introduction of such genes, including the tumor suppressor genes, such as p53, the cystic fibrosis transmembrane regulator cDNA [CFTR], and the genes for anti-HIV ribozymes, such as an anti-HIV gag ribozyme gene, into the artificial chromosomes are also provided.

[0015] The chromosomes provided herein are generated by introducing heterologous DNA that includes DNA encoding one or multiple selectable marker(s) into cells, preferably a stable cell line, growing the cells under selective conditions, and identifying from among the resulting clones those that include chromosomes with more than one centromere and/or fragments thereof. The amplification that produces the additional centromere or centromeres occurs in cells that contain chromosomes in which the heterologous DNA has integrated near the centromere in the pericentric region of the chromosome. The selected clonal cells are then used to generate artificial chromosomes.

[0016] Although non-targeted introduction of DNA results in some frequency of integration into appropriate loci, targeted introduction is preferred. Hence, in preferred embodiments, the DNA with the selectable marker that is introduced into cells to initiate generation of artificial chromosomes includes sequences that target it to an amplifiable region, such as the pericentric region, heterochromatin, and particularly rDNA of the chromosome. For example, vectors, such as pTEMPUD and pHASPUD [provided herein], which include such DNA specific for mouse satellite DNA and human satellite DNA, respectively, are provided. The plasmid pHASPUD is a derivative of pTEMPUD that contains human satellite DNA sequences that specifically target human chromosomes. Preferred targeting sequences include mammalian ribosomal RNA (rRNA) gene sequences (referred to herein as rDNA) which target the heterologous DNA to integrate into the rDNA region of those chromosomes that contain rDNA. For example, vectors, such as pTERPUD, which include mouse rDNA, are provided. Upon integration into existing chromosomes in the cells, these vectors can induce the amplification that results in generation of additional centromeres.

[0017] Artificial chromosomes are generated by culturing the cells with the multicentric, typically dicentric, chromosomes under conditions whereby the chromosome breaks to form a minichromosome and a formerly dicentric chromosome. Among the MACs provided herein are the SATACs, which are primarily made up of repeating units of short satellite DNA and are nearly fully heterochromatic, so that without insertion of heterologous or foreign DNA, the chromosomes preferably contain no genetic information or contain only non-protein-encoding gene sequences such as rDNA sequences. They can thus be used as “safe” vectors for delivery of DNA to mammalian hosts because they do not contain any potentially harmful genes. The SATACs are generated, not from the minichromosome fragment as, for example, in U.S. Pat. No. 5,288,625, but from the fragment of the formerly dicentric chromosome.

[0018] In addition, methods for generating euchromatic minichromosomes and the use thereof are also provided herein. Methods for generating one type of MAC, the minichromosome, previously described in U.S. Pat. No. 5,288,625, and the use thereof for expression of heterologous DNA are provided. In a particular method provided herein for generating a MAC, such as a minichromosome, heterologous DNA that includes mammalian rDNA and one or more selectable marker genes is introduced into cells which are then grown under selective conditions. Resulting cells that contain chromosomes with more than one centromere are selected and cultured under conditions whereby the chromosome breaks to form a minichromosome and a formerly multicentric (typically dicentric) chromosome from which the minichromosome was released.

[0019] Cell lines containing the minichromosome and the use thereof for cell fusion are also provided. In one embodiment, a cell line containing the mammalian minichromosome is used as recipient cells for donor DNA encoding a selected gene or multiple genes. To facilitate integration of the donor DNA into the minichromosome, the recipient cell line preferably contains the minichromosome but does not also contain the formerly dicentric chromosome. This may be accomplished by methods disclosed herein such as cell fusion and selection of cells that contain a minichromosome and no formerly dicentric chromosome. The donor DNA is linked to a second selectable marker and is targeted to and integrated into the minichromosome. The resulting chromosome is transferred by cell fusion into an appropriate recipient cell line, such as a Chinese hamster cell line [CHO]. After large-scale production of the cells carrying the engineered chromosome, the chromosome is isolated. In particular, metaphase chromosomes are obtained, such as by addition of colchicine, and they are purified from the cell lysate. These chromosomes are used for cloning, sequencing and for delivery of heterologous DNA into cells.

[0020] Also provided are SATACs of various sizes that are formed by repeated culturing under selective conditions and subcloning of cells that contain chromosomes produced from the formerly dicentric chromosomes. The exemplified SATACs are based on repeating DNA units that are about 15 Mb [two ~7.5 Mb blocks]. The repeating DNA unit of SATACs formed from other species and other chromosomes may vary, but typically would be on the order of about 7 to about 20 Mb. The repeating DNA units are referred to herein as megareplicons, which in the exemplified SATACs contain tandem blocks of satellite DNA flanked by non-satellite DNA, including heterologous DNA and non-satellite DNA. Amplification produces an array of chromosome segments [each called an amplicon] that contain two inverted megareplicons bordered by heterologous [“foreign”] DNA. Repeated cell fusion, growth on selective medium and/or BrdU [5-bromodeoxyuridine] treatment or treatment with another genome-destabilizing reagent or agent, such as ionizing radiation, including X-rays, and subcloning results in cell lines that carry stable heterochromatic or partially heterochromatic chromosomes, including a 150-200 Mb “sausage” chromosome, a 500-1000 Mb gigachromosome, a stable 250-400 Mb megachromosome and various smaller stable chromosomes derived therefrom. These chromosomes are based on these repeating units and can include heterologous DNA that is expressed.

[0021] Thus, methods for producing MACs of both types (i.e., SATACs and minichromosomes) are provided. These methods are applicable to the production of artificial chromosomes containing centromeres derived from any higher eukaryotic cell, including mammals, birds, fowl, fish, insects and plants.

[0022] The resulting chromosomes can be purified by methods provided herein to provide vectors for introduction of heterologous DNA into selected cells for production of the gene product(s) encoded by the heterologous DNA, for production of transgenic (non-human) animals, birds, fowl, fish and plants or for gene therapy.

[0023] In addition, methods and vectors for fragmenting the minichromosomes and SATACs are provided. Such methods and vectors can be used for in vivo generation of smaller stable artificial chromosomes. Vectors for chromosome fragmentation are used to produce an artificial chromosome that contains a megareplicon, a centromere and two telomeres and will be between about 7.5 Mb and about 60 Mb, preferably between about 10 Mb-15 Mb and 30-50 Mb. As exemplified herein, the preferred range is between about 7.5 Mb and 50 Mb. Such artificial chromosomes may also be produced by other methods.

[0024] Isolation of the 15 Mb [or 30 Mb amplicon containing two 15 Mb inverted repeats] or a 30 Mb or higher multimer, such as 60 Mb, thereof should provide a stable chromosomal vector that can be manipulated in vitro. Methods for reducing the size of the MACs to generate smaller stable self-replicating artificial chromosomes are also provided.

[0025] Also provided herein are methods for producing mammalian artificial chromosomes, including those provided herein, in vitro, and the resulting chromosomes. The methods involve in vitro assembly of the structural and functional elements to provide a stable artificial chromosome. Such elements include a centromere, two telomeres, at least one origin of replication and filler heterochromatin, e.g., satellite DNA. A selectable marker for subsequent selection is also generally included. These specific DNA elements may be obtained from the artificial chromosomes provided herein such as those that have been generated by the introduction of heterologous DNA into cells and the subsequent amplification that leads to the artificial chromosome, particularly the SATACs. Centromere sequences for use in the in vitro construction of artificial chromosomes may also be obtained by employing the centromere cloning methods provided herein. In preferred embodiments, the sequences providing the origin of replication, in particular, the megareplicator, are derived from rDNA. These sequences preferably include the rDNA origin of replication and amplification promoting sequences.

[0026] Methods and vectors for targeting heterologous DNA into the artificial chromosomes are also provided as are methods and vectors for fragmenting the chromosomes to produce smaller but stable and self-replicating artificial chromosomes.

[0027] The chromosomes are introduced into cells to produce stable transformed cell lines or cells, depending upon the source of the cells. Introduction is effected by any suitable method including, but not limited to, electroporation, direct uptake, such as by calcium phosphate precipitation, uptake of isolated chromosomes by lipofection, by microcell fusion, by lipid-mediated carrier systems or other suitable method. The resulting cells can be used for production of proteins in the cells. The chromosomes can be isolated and used for gene delivery. Methods for isolation of the chromosomes based on the DNA content of the chromosomes, which differs in MACs versus the authentic chromosomes, are provided. Also provided are methods that rely on content, particularly density, and size of the MACs.

[0028] These artificial chromosomes can be used in gene therapy, gene product production systems, production of humanized genetically transformed animal organs, production of transgenic plants and animals (non-human), including mammals, birds, fowl, fish, invertebrates, vertebrates, reptiles and insects, any organism or device that would employ chromosomal elements as information storage vehicles, and also for analysis and study of centromere function, for the production of artificial chromosome vectors that can be constructed in vitro, and for the preparation of species-specific artificial chromosomes. The artificial chromosomes can be introduced into cells using microinjection, cell fusion, microcell fusion, electroporation, nuclear transfer, electrofusion, projectile bombardment, calcium phosphate precipitation, lipid-mediated transfer systems and other such methods. Cells particularly suited for use with the artificial chromosomes include, but are not limited to, plant cells, particularly tomato, Arabidopsis, and others, insect cells, including silk worm cells, insect larvae, fish, reptiles, amphibians, arachnids, mammalian cells, avian cells, embryonic stem cells, haematopoietic stem cells, embryos and cells for use in methods of genetic therapy, such as lymphocytes that are used in methods of adoptive immunotherapy and nerve or neural cells. Thus methods of producing gene products and transgenic (non-human) animals and plants are provided. Also provided are the resulting transgenic animals and plants.

[0029] Exemplary cell lines that contain these chromosomes are also provided.

[0030] Methods for preparing artificial chromosomes for particular species and for cloning centromeres are also provided. For example, two exemplary methods provided for generating artificial chromosomes for use in different species are as follows. First, the methods herein may be applied to different species. Second, means for generating species-specific artificial chromosomes and for cloning centromeres are provided. In particular, a method for cloning a centromere from an animal or plant is provided by preparing a library of DNA fragments that contain the genome of the plant or animal and introducing each of the fragments into a mammalian satellite artificial chromosome [SATAC] that contains a centromere from a species, generally a mammal, different from the selected plant or animal, generally a non-mammal, and a selectable marker. The selected plant or animal is one in which the mammalian species centromere does not function. Each of the SATACs is introduced into the cells, which are grown under selective conditions, and cells with SATACs are identified. Such SATACs should contain a centromere encoded by the DNA from the library or should contain the necessary elements for stable replication in the selected species.

[0031] Also provided are libraries in which the relatively large fragments of DNA are contained on artificial chromosomes.

[0032] Transgenic (non-human) animals, invertebrates and vertebrates, plants and insects, fish, reptiles, amphibians, arachnids, birds, fowl, and mammals are also provided. Of particular interest are transgenic (non-human) animals and plants that express genes that confer resistance or reduce susceptibility to disease. For example, the transgene may encode a protein that is toxic to a pathogen, such as a virus, bacterium or pest, but that is not toxic to the transgenic host. Furthermore, since multiple genes can be introduced on a MAC, a series of genes encoding an antigen can be introduced, which upon expression will serve to immunize [in a manner similar to a multivalent vaccine] the host animal against the diseases for which exposure to the antigens provides immunity or some protection.

[0033] Also of interest are transgenic (non-human) animals that serve as models of certain diseases and disorders for use in studying the disease and developing therapeutic treatments and cures thereof. Such animal models of disease express genes [typically carrying a disease-associated mutation], which are introduced into the animal on a MAC and which induce the disease or disorder in the animal. Similarly, MACs carrying genes encoding antisense RNA may be introduced into animal cells to generate conditional “knock-out” transgenic (non-human) animals. In such animals, expression of the antisense RNA results in decreased or complete elimination of the products of genes corresponding to the antisense RNA. Of further interest are transgenic mammals that harbor MAC-carried genes encoding therapeutic proteins that are expressed in the animal's milk. Transgenic (non-human) animals for use in xenotransplantation, which express MAC-carried genes that serve to humanize the animal's organs, are also of interest. Genes that might be used in humanizing animal organs include those encoding human surface antigens.

[0034] Methods for cloning centromeres, such as mammalian centromeres, are also provided. In particular, in one embodiment, a library composed of fragments of SATACs is cloned into YACs [yeast artificial chromosomes] that include a detectable marker, such as DNA encoding tyrosinase, and then introduced into mammalian cells, such as albino mouse embryos. Mice produced from embryos containing such YACs that include a centromere that functions in mammals will express the detectable marker. Thus, if mice are produced from albino mouse embryos into which a functional mammalian centromere was introduced, the mice will be pigmented or have regions of pigmentation.

[0035] A method for producing repeated tandem arrays of DNA is provided. This method, exemplified herein using telomeric DNA, is applicable to any repeat sequence, and in particular, low complexity repeats. The method provided herein for synthesis of arrays of tandem DNA repeats is based on a series of extension steps in which successive doublings of a sequence of repeats result in an exponential expansion of the array of tandem repeats. An embodiment of the method of synthesizing DNA fragments containing tandem repeats may generally be described as follows. Two oligonucleotides are used as starting materials. Oligonucleotide 1 is of length k of repeated sequence (the flanks of which are not relevant) and contains a relatively short stretch (60-90 nucleotides) of the repeated sequence, flanked with appropriately chosen restriction sites:

5′-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>S2_-3′

[0036] where S1 is restriction site 1 cleaved by E1, S2 is a second restriction site cleaved by E2, ‘>’ represents a simple repeat unit, and ‘_’ denotes a short (8-10) nucleotide flanking sequence complementary to oligonucleotide 2:

3′-_S3-5′

[0037] where S3 is a third restriction site for enzyme E3 and which is present in the vector to be used during the construction. The method involves the following steps: (1) oligonucleotides 1 and 2 are annealed; (2) the annealed oligonucleotides are filled-in to produce a double-stranded (ds) sequence; (3) the double-stranded DNA is cleaved with restriction enzymes E1 and E3 and subsequently ligated into a vector (e.g., pUC19 or a yeast vector) that has been cleaved with the same enzymes E1 and E3; (4) the insert is isolated from a first portion of the plasmid by digesting with restriction enzymes E1 and E3, and a second portion of the plasmid is cut with enzymes E2 (treated to remove the 3′-overhang) and E3, and the large fragment (plasmid DNA plus the insert) is isolated; (5) the two DNA fragments (the S1-S3 insert fragment and the vector plus insert) are ligated; and (6) steps 4 and 5 are repeated as many times as needed to achieve the desired repeat sequence size. In each extension cycle, the repeat sequence size doubles, i.e., if m is the number of extension cycles, the size of the repeat sequence will be k×2^m nucleotides.
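The doubling relationship stated above can be illustrated with a short calculation. The following Python sketch is offered only as an aid to the arithmetic and is not part of the disclosed method; the function names repeat_length and cycles_needed are hypothetical, and the sketch assumes an idealized cycle in which each repetition of steps 4 and 5 exactly doubles the repeat sequence.

```python
def repeat_length(k: int, m: int) -> int:
    """Return the repeat-sequence size, in nucleotides, after m extension
    cycles starting from an initial repeat stretch of k nucleotides.

    Assumes ideal doubling in every cycle, i.e. size = k * 2**m.
    """
    if k <= 0:
        raise ValueError("k must be a positive number of nucleotides")
    if m < 0:
        raise ValueError("m must be a non-negative number of cycles")
    return k * 2 ** m


def cycles_needed(k: int, target: int) -> int:
    """Return the smallest number of extension cycles m for which
    k * 2**m is at least the target size (in nucleotides)."""
    if k <= 0 or target <= 0:
        raise ValueError("k and target must be positive")
    m = 0
    length = k
    # Double the array until it reaches or exceeds the target size.
    while length < target:
        length *= 2
        m += 1
    return m


if __name__ == "__main__":
    # Hypothetical starting stretch of 60 nucleotides (within the 60-90 range).
    for m in range(6):
        print(m, repeat_length(60, m))
    print(cycles_needed(60, 10_000))
```

Under these assumptions, a 60-nucleotide starting stretch reaches 480 nucleotides after three cycles, and eight cycles are the fewest that yield at least 10,000 nucleotides (60×2^8 = 15,360).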

DESCRIPTION OF THE DRAWINGS

[0038] FIG. 1 is a schematic drawing depicting formation of the MMCneo [the minichromosome] chromosome. A-G represents the successive events consistent with observed data that would lead to the formation and stabilization of the minichromosome.

[0039] FIG. 2 shows a schematic summary of the manner in which the observed new chromosomes would form, and the relationships among the different de novo formed chromosomes. In particular, this figure shows a schematic drawing of the de novo chromosome formation initiated in the centromeric region of mouse chromosome 7. (A) A single E-type amplification in the centromeric region of chromosome 7 generates a neo-centromere linked to the integrated “foreign” DNA, and forms a dicentric chromosome. Multiple E-type amplification forms the λ neo-chromosome, which separates from the remainder of mouse chromosome 7 through a specific breakage between the centromeres of the dicentric chromosome and which was stabilized in a mouse-hamster hybrid cell line; (B) Specific breakage between the centromeres of a dicentric chromosome 7 generates a chromosome fragment with the neo-centromere, and a chromosome 7 with traces of heterologous DNA at the end; (C) Inverted duplication of the fragment bearing the neo-centromere results in the formation of a stable neo-minichromosome; (D) Integration of exogenous DNA into the heterologous DNA region of the formerly dicentric chromosome 7 initiates H-type amplification, and the formation of a heterochromatic arm. By capturing a euchromatic terminal segment, this new chromosome arm is stabilized in the form of the “sausage” chromosome; (E) BrdU [5-bromodeoxyuridine] treatment and/or drug selection induce further H-type amplification, which results in the formation of an unstable gigachromosome; (F) Repeated BrdU treatments and/or drug selection induce further H-type amplification including a centromere duplication, which leads to the formation of another heterochromatic chromosome arm. It is split off from the chromosome 7 by chromosome breakage, and by acquiring a terminal segment, the stable megachromosome is formed.

[0040] FIG. 3 is a schematic diagram of the replicon structure and a scheme by which a megachromosome could be produced.

[0041] FIG. 4 sets forth the relationships among some of the exemplary cell lines described herein.

[0042] FIG. 5 is a diagram of the plasmid pTEMPUD.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0043] Definitions

[0044] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which this invention belongs. All patents and publications referred to herein are incorporated by reference.

[0045] As used herein, a mammalian artificial chromosome [MAC] is a piece of DNA that can stably replicate and segregate alongside endogenous chromosomes. It has the capacity to accommodate and express heterologous genes inserted therein. It is referred to as a mammalian artificial chromosome because it includes an active mammalian centromere(s). Plant artificial chromosomes, insect artificial chromosomes and avian artificial chromosomes refer to chromosomes that include plant, insect and avian centromeres, respectively. A human artificial chromosome [HAC] refers to chromosomes that include human centromeres, BUGACs refer to insect artificial chromosomes, and AVACs refer to avian artificial chromosomes. Among the MACs provided herein are SATACs, minichromosomes, and in vitro synthesized artificial chromosomes. Methods for construction of each type are provided herein.

[0046] As used herein, in vitro synthesized artificial chromosomes are artificial chromosomes that are produced by joining the essential components (at least the centromere and origins of replication) in vitro.

[0047] As used herein, endogenous chromosomes refer to genomic chromosomes as found in the cell prior to generation or introduction of a MAC.

[0048] As used herein, stable maintenance of chromosomes occurs when at least about 85%, preferably 90%, more preferably 95%, of the cells retain the chromosome. Stability is measured in the presence of a selective agent. Preferably these chromosomes are also maintained in the absence of a selective agent. Stable chromosomes also retain their structure during cell culturing, suffering neither intrachromosomal nor interchromosomal rearrangements.
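
The retention thresholds in this definition can be illustrated with a short sketch. The function name, the tier labels, and the treatment of "about 85%" as a hard cutoff are assumptions made only for illustration; the definition itself prescribes no scoring procedure.

```python
def stability_tier(cells_retaining: int, cells_scored: int) -> str:
    """Classify chromosome retention against the 85/90/95% thresholds.

    Hypothetical helper: the tier labels and the hard cutoffs are
    illustrative assumptions, not part of the definition.
    """
    if cells_scored <= 0 or not 0 <= cells_retaining <= cells_scored:
        raise ValueError("need cells_scored > 0 and 0 <= cells_retaining <= cells_scored")
    # Compare as integers so the cutoffs are exact (no floating-point rounding).
    scaled = 100 * cells_retaining
    if scaled >= 95 * cells_scored:
        return "more preferred"
    if scaled >= 90 * cells_scored:
        return "preferred"
    if scaled >= 85 * cells_scored:
        return "stable"
    return "not stable"
```

For example, if 172 of 200 scored cells retain the chromosome (86%), the sketch returns "stable". Structural integrity of the chromosome, which the definition also requires, is not captured by this count.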

[0049] As used herein, growth under selective conditions means growth of a cell under conditions that require expression of a selectable marker for survival.

[0050] As used herein, an agent that destabilizes a chromosome is any agent known by those of skill in the art to enhance amplification events or mutations. Such agents, which include BrdU, are well known to those of skill in the art.

[0051] As used herein, de novo, with reference to a centromere, refers to generation of an excess centromere as a result of incorporation of a heterologous DNA fragment using the methods herein.

[0052] As used herein, euchromatin and heterochromatin have their recognized meanings: euchromatin refers to chromatin that stains diffusely and that typically contains genes, and heterochromatin refers to chromatin that remains unusually condensed and that has been thought to be transcriptionally inactive. Highly repetitive DNA sequences [satellite DNA], at least with respect to mammalian cells, are usually located in regions of the heterochromatin surrounding the centromere [pericentric heterochromatin]. Constitutive heterochromatin refers to heterochromatin that contains the highly repetitive DNA which is constitutively condensed and genetically inactive.

[0053] As used herein, BrdU refers to 5-bromodeoxyuridine, which during replication is inserted in place of thymidine. BrdU is used as a mutagen; it also inhibits condensation of metaphase chromosomes during cell division.

[0054] As used herein, a dicentric chromosome is a chromosome that contains two centromeres. A multicentric chromosome contains more than two centromeres.

[0055] As used herein, a formerly dicentric chromosome is a chromosome that is produced when a dicentric chromosome fragments and acquires new telomeres so that two chromosomes, each having one of the centromeres, are produced. Each of the fragments is a replicable chromosome. If one of the chromosomes undergoes amplification of euchromatic DNA to produce a fully functional chromosome that contains the newly introduced heterologous DNA and primarily [at least more than 50%] euchromatin, it is a minichromosome. The remaining chromosome is a formerly dicentric chromosome. If one of the chromosomes undergoes amplification, whereby heterochromatin [satellite DNA] is amplified and a euchromatic portion [or arm] remains, it is referred to as a sausage chromosome. A chromosome that is substantially all heterochromatin, except for portions of heterologous DNA, is called a SATAC. Such chromosomes [SATACs] can be produced from sausage chromosomes by culturing the cell containing the sausage chromosome under conditions, such as BrdU treatment and/or growth under selective conditions, that destabilize the chromosome so that a satellite artificial chromosome [SATAC] is produced. For purposes herein, it is understood that SATACs may not necessarily be produced in multiple steps, but may appear after the initial introduction of the heterologous DNA and growth under selective conditions, or they may appear after several cycles of growth under selective conditions and BrdU treatment.

[0056] As used herein, a SATAC refers to a chromosome that is substantially all heterochromatin, except for portions of heterologous DNA. Typically, SATACs are satellite DNA based artificial chromosomes, but the term encompasses any chromosome made by the methods herein that contains more heterochromatin than euchromatin.

[0057] As used herein, amplifiable, when used in reference to a chromosome, particularly the method of generating SATACs provided herein, refers to a region of a chromosome that is prone to amplification. Amplification typically occurs during replication and other cellular events involving recombination. Such regions are typically regions of the chromosome that include tandem repeats, such as satellite DNA, rDNA and other such sequences.

[0058] As used herein, amplification, with reference to DNA, is a process in which segments of DNA are duplicated to yield two or multiple copies of identical or nearly identical DNA segments that are typically joined as substantially tandem or successive repeats or inverted repeats.

[0059] As used herein, an amplicon is a repeated DNA amplification unit that contains a set of inverted repeats of the megareplicon. A megareplicon represents a higher order replication unit. For example, with reference to the SATACs, the megareplicon contains a set of tandem DNA blocks each containing satellite DNA flanked by non-satellite DNA. Contained within the megareplicon is a primary replication site, referred to as the megareplicator, which may be involved in organizing and facilitating replication of the pericentric heterochromatin and possibly the centromeres. Within the megareplicon there may be smaller [e.g., 50-300 kb in some mammalian cells] secondary replicons. In the exemplified SATACs, the megareplicon is defined by two tandem ~7.5 Mb DNA blocks [see, e.g., FIG. 3]. Within each artificial chromosome [AC] or among a population thereof, each amplicon has the same gross structure but may contain sequence variations. Such variations will arise as a result of movement of mobile genetic elements, deletions or insertions or mutations that arise, particularly in culture. Such variation does not affect the use of the ACs or their overall structure as described herein.

[0060] As used herein, ribosomal RNA [rRNA] is the specialized RNA that forms part of the structure of a ribosome and participates in the synthesis of proteins. Ribosomal RNA is produced by transcription of genes which, in eukaryotic cells, are present in multiple copies. In human cells, the approximately 250 copies of rRNA genes per haploid genome are spread out in clusters on at least five different chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the presence of ribosomal DNA [rDNA] has been verified on at least 11 pairs out of 20 mouse chromosomes [chromosomes 5, 6, 9, 11, 12, 15, 16, 17, 18, 19 and X] [see e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al. (1993) Mamm. Genome 4:49-52]. In eukaryotic cells, the multiple copies of the highly conserved rRNA genes are located in a tandemly arranged series of rDNA units, which are generally about 40-45 kb in length and contain a transcribed region and a nontranscribed region known as spacer (i.e., intergenic spacer) DNA which can vary in length and sequence. In the human and mouse, these tandem arrays of rDNA units are located adjacent to the pericentric satellite DNA sequences (heterochromatin). The regions of these chromosomes in which the rDNA is located are referred to as nucleolar organizing regions (NOR) which loop into the nucleolus, the site of ribosome production within the cell nucleus.

[0061] As used herein, the minichromosome refers to a chromosome derived from a multicentric, typically dicentric, chromosome [see, e.g., FIG. 1] that contains more euchromatic than heterochromatic DNA.

[0062] As used herein, a megachromosome refers to a chromosome that, except for introduced heterologous DNA, is substantially composed of heterochromatin. Megachromosomes are made of an array of repeated amplicons that contain two inverted megareplicons bordered by introduced heterologous DNA [see, e.g., FIG. 3 for a schematic drawing of a megachromosome]. For purposes herein, a megachromosome is about 50 to 400 Mb, generally about 250-400 Mb. Shorter variants are also referred to as truncated megachromosomes [about 90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb] and cell lines, and a micro-megachromosome [~50-90 Mb, typically 50-60 Mb]. For purposes herein, the term megachromosome refers to the overall repeated structure based on an array of repeated chromosomal segments [amplicons] that contain two inverted megareplicons bordered by any inserted heterologous DNA. The size will be specified.

[0063] As used herein, genetic therapy involves the transfer or insertion of heterologous DNA into certain cells, target cells, to produce specific gene products that are involved in correcting or modulating disease. The DNA is introduced into the selected target cells in a manner such that the heterologous DNA is expressed and a product encoded thereby is produced. Alternatively, the heterologous DNA may in some manner mediate expression of DNA that encodes the therapeutic product. It may encode a product, such as a peptide or RNA, that in some manner mediates, directly or indirectly, expression of a therapeutic product. Genetic therapy may also be used to introduce therapeutic compounds, such as TNF, that are not normally produced in the host or that are not produced in therapeutically effective amounts or at a therapeutically useful time. Expression of the heterologous DNA by the target cells within an organism afflicted with the disease thereby enables modulation of the disease. The heterologous DNA encoding the therapeutic product may be modified prior to introduction into the cells of the afflicted host in order to enhance or otherwise alter the product or expression thereof.

[0064] As used herein, heterologous or foreign DNA and RNA are used interchangeably and refer to DNA or RNA that does not occur naturally as part of the genome in which it is present or which is found in a location or locations in the genome that differ from that in which it occurs in nature. It is DNA or RNA that is not endogenous to the cell and has been exogenously introduced into the cell. Examples of heterologous DNA include, but are not limited to, DNA that encodes a gene product or gene product(s) of interest, introduced for purposes of gene therapy or for production of an encoded protein. Other examples of heterologous DNA include, but are not limited to, DNA that encodes traceable marker proteins, such as a protein that confers drug resistance, DNA that encodes therapeutically effective substances, such as anti-cancer agents, enzymes and hormones, and DNA that encodes other types of proteins, such as antibodies. Antibodies that are encoded by heterologous DNA may be secreted or expressed on the surface of the cell in which the heterologous DNA has been introduced.

[0065] As used herein, a therapeutically effective product is a product that is encoded by heterologous DNA and that, upon introduction of the DNA into a host, is expressed and effectively ameliorates or eliminates the symptoms or manifestations of an inherited or acquired disease, or cures said disease.

[0066] As used herein, transgenic plants refer to plants in which heterologous or foreign DNA is expressed or in which the expression of a gene naturally present in the plant has been altered.

[0067] As used herein, operative linkage of heterologous DNA to regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences refers to the relationship between such DNA and such sequences of nucleotides. For example, operative linkage of heterologous DNA to a promoter refers to the physical relationship between the DNA and the promoter such that the transcription of such DNA is initiated from the promoter by an RNA polymerase that specifically recognizes, binds to and transcribes the DNA in reading frame. Preferred promoters include tissue specific promoters, such as mammary gland specific promoters, viral promoters, such as TK, CMV, adenovirus promoters, and other promoters known to those of skill in the art.

[0068] As used herein, isolated, substantially pure DNA refers to DNA fragments purified according to standard techniques employed by those skilled in the art, such as that found in Maniatis et al. [(1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.].

[0069] As used herein, expression refers to the process by which nucleic acid is transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the nucleic acid is derived from genomic DNA, expression may, if an appropriate eukaryotic host cell or organism is selected, include splicing of the mRNA.

[0070] As used herein, vector or plasmid refers to discrete elements that are used to introduce heterologous DNA into cells for either expression of the heterologous DNA or for replication of the cloned heterologous DNA. Selection and use of such vectors and plasmids are well within the level of skill of the art.

[0071] As used herein, transformation/transfection refers to the process by which DNA or RNA is introduced into cells. Transfection refers to the taking up of exogenous nucleic acid, e.g., an expression vector, by a host cell whether or not any coding sequences are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, by direct uptake using calcium phosphate [CaPO4; see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376], polyethylene glycol [PEG]-mediated DNA uptake, electroporation, lipofection [see, e.g., Strauss (1996) Meth. Mol. Biol. 54:307-327], microcell fusion [see, EXAMPLES, see, also Lambert (1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Pat. No. 5,396,767, Sanford et al. (1987) Somatic Cell Mol. Genet. 13:279-284; Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary et al. (1995) Meth. Enzymol. 254:133-152], lipid-mediated carrier systems [see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Boich et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth. Enzymol. 217:599-618] or other suitable method. Successful transfection is generally recognized by detection of the presence of the heterologous nucleic acid within the transfected cell, such as any indication of the operation of a vector within the host cell. Transformation means introducing DNA into an organism so that the DNA is replicable, either as an extrachromosomal element or by chromosomal integration.

[0072] As used herein, injected refers to the microinjection [use of a small syringe] of DNA into a cell.

[0073] As used herein, substantially homologous DNA refers to DNA that includes a sequence of nucleotides that is sufficiently similar to another such sequence to form stable hybrids under specified conditions.

[0074] It is well known to those of skill in this art that nucleic acid fragments with different sequences may, under the same conditions, hybridize detectably to the same “target” nucleic acid. Two nucleic acid fragments hybridize detectably, under stringent conditions over a sufficiently long hybridization period, because one fragment contains a segment of at least about 14 nucleotides in a sequence which is complementary [or nearly complementary] to the sequence of at least one segment in the other nucleic acid fragment. If the time during which hybridization is allowed to occur is held constant, at a value during which, under preselected stringency conditions, two nucleic acid fragments with exactly complementary base-pairing segments hybridize detectably to each other, departures from exact complementarity can be introduced into the base-pairing segments, and base-pairing will nonetheless occur to an extent sufficient to make hybridization detectable. As the departure from complementarity between the base-pairing segments of two nucleic acids becomes larger, and as conditions of the hybridization become more stringent, the probability decreases that the two segments will hybridize detectably to each other.

[0075] Two single-stranded nucleic acid segments have “substantially the same sequence,” within the meaning of the present specification, if (a) both form a base-paired duplex with the same segment, and (b) the melting temperatures of said two duplexes in a solution of 0.5×SSPE differ by less than 10° C. If the segments being compared have the same number of bases, then to have “substantially the same sequence”, they will typically differ in their sequences at fewer than 1 base in 10. Methods for determining melting temperatures of nucleic acid duplexes are well known [see, e.g., Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284 and references cited therein].
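
The two numerical tests in this definition can be sketched as follows. The function names are hypothetical, the melting temperatures are assumed to be measured values supplied by the user (no melting-temperature formula is implied), and the "fewer than 1 base in 10" rule is applied only to segments of equal length, as the definition states.

```python
def mismatch_count(seg_a: str, seg_b: str) -> int:
    """Count positions at which two equal-length segments differ."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must have the same number of bases")
    return sum(x != y for x, y in zip(seg_a.upper(), seg_b.upper()))


def fewer_than_one_in_ten(seg_a: str, seg_b: str) -> bool:
    """True if the segments differ at fewer than 1 base in 10."""
    # mismatches / length < 1/10, written with integers to avoid rounding.
    return 10 * mismatch_count(seg_a, seg_b) < len(seg_a)


def tm_within_limit(tm_a_celsius: float, tm_b_celsius: float,
                    limit_celsius: float = 10.0) -> bool:
    """True if two measured duplex melting temperatures (0.5x SSPE)
    differ by less than the limit (10 degrees C in the definition)."""
    return abs(tm_a_celsius - tm_b_celsius) < limit_celsius
```

With two 20-base segments, a single mismatch satisfies the 1-in-10 rule, whereas two mismatches (exactly 1 in 10) do not, because the definition requires fewer than 1 in 10.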

[0076] As used herein, a nucleic acid probe is a DNA or RNA fragment that includes a sufficient number of nucleotides to specifically hybridize to DNA or RNA that includes identical or closely related sequences of nucleotides. A probe may contain any number of nucleotides, from as few as about 10 to as many as hundreds of thousands of nucleotides. The conditions and protocols for such hybridization reactions are well known to those of skill in the art as are the effects of probe size, temperature, degree of mismatch, salt concentration and other parameters on the hybridization reaction. For example, the lower the temperature and higher the salt concentration at which the hybridization reaction is carried out, the greater the degree of mismatch that may be present in the hybrid molecules.

[0077] To be used as a hybridization probe, the nucleic acid is generally rendered detectable by labelling it with a detectable moiety or label, such as ³²P, ³H and ¹⁴C, or by other means, including chemical labelling, such as by nick-translation in the presence of deoxyuridylate biotinylated at the 5′-position of the uracil moiety. The resulting probe includes the biotinylated uridylate in place of thymidylate residues and can be detected [via the biotin moieties] by any of a number of commercially available detection systems based on binding of streptavidin to the biotin. Such commercially available detection systems can be obtained, for example, from Enzo Biochemicals, Inc. [New York, N.Y.]. Any other label known to those of skill in the art, including non-radioactive labels, may be used as long as it renders the probes sufficiently detectable, which is a function of the sensitivity of the assay, the time available [for culturing cells, extracting DNA, and hybridization assays], the quantity of DNA or RNA available as a source of the probe, the particular label and the means used to detect the label.

[0078] Once sequences with a sufficiently high degree of homology to the probe are identified, they can readily be isolated by standard techniques, which are described, for example, by Maniatis et al. ((1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

[0079] As used herein, conditions under which DNA molecules form stable hybrids and are considered substantially homologous are such that DNA molecules with at least about 60% complementarity form stable hybrids. Such DNA fragments are herein considered to be “substantially homologous”. For example, DNA that encodes a particular protein is substantially homologous to another DNA fragment if the DNA forms stable hybrids such that the sequences of the fragments are at least about 60% complementary and if a protein encoded by the DNA retains its activity.

[0080] For purposes herein, the following stringency conditions are defined:

[0081] 1) high stringency: 0.1×SSPE, 0.1% SDS, 65° C.

[0082] 2) medium stringency: 0.2×SSPE, 0.1% SDS, 50° C.

[0083] 3) low stringency: 1.0×SSPE, 0.1% SDS, 50° C.

[0084] or any combination of salt and temperature and other reagents that result in selection of the same degree of mismatch or matching.
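
For record keeping, the three defined stringency conditions can be captured in a small lookup table. The sketch below is illustrative only: the class, field and function names are assumptions, and it encodes only the three listed conditions, not the equivalent combinations of salt, temperature and other reagents that are also permitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    sspe_x: float       # SSPE concentration, as a multiple of 1x
    sds_percent: float  # SDS concentration, percent
    temp_c: int         # temperature, degrees C


# Values transcribed from the three defined conditions.
STRINGENCY = {
    "high": Stringency(sspe_x=0.1, sds_percent=0.1, temp_c=65),
    "medium": Stringency(sspe_x=0.2, sds_percent=0.1, temp_c=50),
    "low": Stringency(sspe_x=1.0, sds_percent=0.1, temp_c=50),
}


def describe(level: str) -> str:
    """Return a one-line description of a named stringency level."""
    c = STRINGENCY[level]  # raises KeyError for an undefined level
    return f"{c.sspe_x}x SSPE, {c.sds_percent}% SDS, {c.temp_c} C"
```

In this table, medium and low stringency differ only in salt concentration, while high stringency both lowers the salt and raises the temperature.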

[0085] As used herein, immunoprotective refers to the ability of a vaccine or exposure to an antigen or immunity-inducing agent, to confer upon a host to whom the vaccine or antigen is administered or introduced, the ability to resist infection by a disease-causing pathogen or to have reduced symptoms. The selected antigen is typically an antigen that is presented by the pathogen.

[0086] As used herein, all assays and procedures, such as hybridization reactions and antibody-antigen reactions, unless otherwise specified, are conducted under conditions recognized by those of skill in the art as standard conditions.

[0087] A. Preparation of cell lines containing MACs

[0088] 1. The megareplicon

[0089] The methods, cells and MACs provided herein are produced by virtue of the discovery of the existence of a higher-order replication unit [megareplicon] of the centromeric region. This megareplicon is delimited by a primary replication initiation site [megareplicator], and appears to facilitate replication of the centromeric heterochromatin, and most likely, centromeres. Integration of heterologous DNA into the megareplicator region or in close proximity thereto, initiates a large-scale amplification of megabase-size chromosomal segments, which leads to de novo chromosome formation in living cells.

[0090] DNA sequences that provide a preferred megareplicator are the rDNA units that give rise to ribosomal RNA (rRNA). In mammals, particularly mice and humans, these rDNA units contain specialized elements, such as the origin of replication (or origin of bidirectional replication, i.e., OBR, in mouse) and amplification promoting sequences (APS) and amplification control elements (ACE) (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp. Cell. Res. 209:123-132; Little et al. (1993) Mol. Cell. Biol. 13:6600-6613; Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489; Gonzalez and Sylvester (1995) Genomics 27:320-328; Miesfeld and Arnheim (1982) Nuc. Acids Res. 10:3933-3949; Maden et al. (1987) Biochem. J. 246:519-527).

[0091] As described herein, without being bound by any theory, these specialized elements may facilitate replication and/or amplification of megabase-size chromosomal segments in the de novo formation of chromosomes, such as those described herein, in cells. These specialized elements are typically located in the nontranscribed intergenic spacer region upstream of the transcribed region of rDNA. The intergenic spacer region may itself contain internally repeated sequences which can be classified as tandemly repeated blocks and nontandem blocks (see e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse rDNA, an origin of bidirectional replication may be found within a 3-kb initiation zone centered approximately 1.6 kb upstream of the transcription start site (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518). The sequences of these specialized elements tend to have an altered chromatin structure, which may be detected, for example, by nuclease hypersensitivity or the presence of AT-rich regions that can give rise to bent DNA structures. An exemplary sequence encompassing an origin of replication is shown in SEQ ID NO. 16 and in GENBANK accession no. X82564 at about positions 2430-5435. Exemplary sequences encompassing amplification-promoting sequences include nucleotides 690-1060 and 1105-1530 of SEQ ID NO. 16.
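
The nucleotide ranges cited for SEQ ID NO. 16 can be pulled out of a sequence string as sketched below. This is a minimal illustration that assumes the positions are 1-based and inclusive, as is conventional for sequence listings; the function and constant names are hypothetical, and the sequence itself is not reproduced here, so any input string is supplied by the user.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


# Ranges cited for amplification-promoting sequences in SEQ ID NO. 16.
APS_RANGES = [(690, 1060), (1105, 1530)]


def extract_regions(seq: str, ranges=APS_RANGES) -> list:
    """Extract each cited range from a user-supplied sequence."""
    return [subsequence(seq, start, end) for start, end in ranges]
```

Under the inclusive convention assumed here, the two cited ranges span 371 and 426 nucleotides, respectively.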

[0092] In human rDNA, a primary replication initiation site may be found a few kilobase pairs upstream of the transcribed region and secondary initiation sites may be found throughout the nontranscribed intergenic spacer region (see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489). A complete human rDNA repeat unit is presented in GENBANK as accession no. U13369 and is set forth in SEQ ID NO. 17 herein. Another exemplary sequence encompassing a replication initiation site may be found within the sequence of nucleotides 35355-42486 in SEQ ID NO. 17, particularly within the sequence of nucleotides 37912-42486 and more particularly within the sequence of nucleotides 37912-39288 of SEQ ID NO. 17 (see Coffman et al. (1993) Exp. Cell. Res. 209:123-132).

[0093] Cell lines containing MACs can be prepared by transforming cells, preferably a stable cell line, with a heterologous DNA fragment that encodes a selectable marker, culturing under selective conditions, and identifying cells that have a multicentric, typically dicentric, chromosome. These cells can then be manipulated as described herein to produce the minichromosomes and other MACs, particularly the heterochromatic SATACs, as described herein.

[0094] Development of a multicentric, particularly dicentric, chromosome typically is effected through integration of the heterologous DNA in the pericentric heterochromatin, preferably in the centromeric regions of chromosomes carrying rDNA sequences. Thus, the frequency of incorporation can be increased by targeting to these regions, such as by including DNA, including, but not limited to, rDNA or satellite DNA, in the heterologous fragment that encodes the selectable marker. Among the preferred targeting sequences for directing the heterologous DNA to the pericentromeric heterochromatin are rDNA sequences that target centromeric regions of chromosomes that carry rRNA genes. Such sequences include, but are not limited to, the DNA of SEQ ID NO. 16 and GENBANK accession no. X82564 and portions thereof, the DNA of SEQ ID NO. 17 and GENBANK accession no. U13369 and portions thereof, and the DNA of SEQ ID NOS. 18-24. A particular vector incorporating DNA from within SEQ ID NO. 16 for use in directing integration of heterologous DNA into chromosomal rDNA is pTERPUD (see Example 12). Satellite DNA sequences can also be used to direct the heterologous DNA to integrate into the pericentric heterochromatin. For example, vectors pTEMPUD and pHASPUD, which contain mouse and human satellite DNA, respectively, are provided herein (see Example 12) as exemplary vectors for introduction of heterologous DNA into cells for de novo artificial chromosome formation.

[0095] The resulting cell lines can then be treated as the exemplified cells herein to produce cells in which the dicentric chromosome has fragmented. The cells can then be used to introduce additional selective markers into the fragmented dicentric chromosome (i.e., formerly dicentric chromosome), whereby amplification of the pericentric heterochromatin will produce the heterochromatic chromosomes.

[0096] The following discussion describes this process with reference to the EC3/7 line and the resulting cells. The same procedures can be applied to any other cells, particularly cell lines, to create SATACs and euchromatic minichromosomes.

[0097] 2. Formation of de novo chromosomes

[0098] De novo centromere formation in a transformed mouse LMTK- fibroblast cell line [EC3/7] after cointegration of λ constructs [λCM8 and λgtWESneo] carrying human and bacterial DNA [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S. application Ser. No. 08/375,271] has been shown. The integration of the “heterologous” engineered human, bacterial and phage DNA, and the subsequent amplification of mouse and heterologous DNA that led to the formation of a dicentric chromosome, occurred at the centromeric region of the short arm of a mouse chromosome. By G-banding, this chromosome was identified as mouse chromosome 7. Because of the presence of two functionally active centromeres on the same chromosome, regular breakages occur between the centromeres. Such specific chromosome breakages gave rise to the appearance [in approximately 10% of the cells] of a chromosome fragment carrying the neo-centromere. From the EC3/7 cell line [see, U.S. Pat. No. 5,288,625, deposited at the European Collection of Animal Cell Culture (hereinafter ECACC) under accession no. 90051001; see, also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, and U.S. application Ser. No. 08/375,271 and the corresponding published European application EP 0 473 253], two sublines [EC3/7C5 and EC3/7C6] were selected by repeated single-cell cloning. In these cell lines, the neo-centromere was found exclusively on a minichromosome [neo-minichromosome], while the formerly dicentric chromosome carried traces of “heterologous” DNA.

[0099] It has now been discovered that integration of DNA encoding a selectable marker in the heterochromatic region of the centromere led to formation of the dicentric chromosome.

[0100] 3. The neo-minichromosome

[0101] The chromosome breakage in the EC3/7 cells, which separates the neo-centromere from the mouse chromosome, occurred in the G-band positive “heterologous” DNA region. This is supported by the observation of traces of λ and human DNA sequences at the broken end of the formerly dicentric chromosome. Comparing the G-band pattern of the chromosome fragment carrying the neo-centromere with that of the stable neo-minichromosome, it is apparent that the neo-minichromosome is an inverted duplicate of the chromosome fragment that bears the neo-centromere. This is supported by the observation that although the neo-minichromosome carries only one functional centromere, both ends of the minichromosome are heterochromatic, and mouse satellite DNA sequences were found in these heterochromatic regions by in situ hybridization.

[0102] Mouse cells containing the minichromosome, which contains multiple repeats of the heterologous DNA, which in the exemplified embodiment is λ DNA and the neomycin-resistance gene, can be used as recipient cells in cell transformation. Donor DNA, such as selected heterologous DNA containing λ DNA linked to a second selectable marker, such as the gene encoding hygromycin phosphotransferase which confers hygromycin resistance [hyg], can be introduced into the mouse cells and integrated into the minichromosomes by homologous recombination of λ DNA in the donor DNA with that in the minichromosomes. Integration is verified by in situ hybridization and Southern blot analyses. Transcription and translation of the heterologous DNA is confirmed by primer extension and immunoblot analyses.

[0103] For example, DNA has been targeted into the neo-minichromosome in EC3/7C5 cells using a λ DNA-containing construct [pNem1ruc] that also contains DNA encoding hygromycin resistance and the Renilla luciferase gene linked to a promoter, such as the cytomegalovirus [CMV] early promoter, and the bacterial neomycin resistance-encoding DNA. Integration of the donor DNA into the chromosome in selected cells [designated PHN4] was confirmed by nucleic acid amplification [PCR] and in situ hybridization. Events that would produce a neo-minichromosome are depicted in FIG. 1.

[0104] The resulting engineered minichromosome that contains the heterologous DNA can then be transferred by cell fusion into a recipient cell line, such as Chinese hamster ovary cells [CHO], and correct expression of the heterologous DNA can be verified. Following production of the cells, metaphase chromosomes are obtained, such as by addition of colchicine, and the chromosomes purified by addition of AT- and GC-specific dyes on a dual laser beam based cell sorter (see Example 10B for a description of methods of isolating artificial chromosomes). Preparative amounts of chromosomes [5×10⁴-5×10⁷ chromosomes/ml] at a purity of 95% or higher can be obtained. The resulting chromosomes are used for delivery to cells by methods such as microinjection and liposome-mediated transfer.

[0105] Thus, the neo-minichromosome is stably maintained in cells, replicates autonomously, and permits the persistent long-term expression of the neo gene under non-selective culture conditions. It also contains megabases of heterologous known DNA [λ DNA in the exemplified embodiments] that serves as target sites for homologous recombination and integration of DNA of interest. The neo-minichromosome is, thus, a vector for genetic engineering of cells. It has been introduced into SCID mice, and shown to replicate in the same manner as endogenous chromosomes.

[0106] The methods herein provide means to induce the events that lead to formation of the neo-minichromosome by introducing heterologous DNA with a selective marker [preferably a dominant selectable marker] into cells and culturing the cells under selective conditions. As a result, cells that contain a multicentric, e.g., dicentric, chromosome, or fragments thereof, generated by amplification are produced. Cells with the dicentric chromosome can then be treated to destabilize the chromosomes with agents, such as BrdU, and/or by culturing under selective conditions, resulting in cells in which the dicentric chromosome has formed two chromosomes: a so-called minichromosome, and a formerly dicentric chromosome that has typically undergone amplification in the heterochromatin where the heterologous DNA has integrated to produce a SATAC or a sausage chromosome [discussed below]. These cells can be fused with other cells to separate the minichromosome from the formerly dicentric chromosome into different cells so that each type of MAC can be manipulated separately.

[0107] 4. Preparation of SATACs

[0108] An exemplary protocol for preparation of SATACs is illustrated in FIG. 2 [particularly D, E and F] and FIG. 3 [see, also the EXAMPLES, particularly EXAMPLES 4-7].

[0109] To prepare a SATAC, the starting materials are cells, preferably a stable cell line, such as a fibroblast cell line, and a DNA fragment that includes DNA that encodes a selective marker. The DNA fragment is introduced into the cell by methods of DNA transfer, including but not limited to direct uptake using calcium phosphate, electroporation, and lipid-mediated transfer. To ensure integration of the DNA fragment in the heterochromatin, it is preferable to start with DNA that will be targeted to the pericentric heterochromatic region of the chromosome, such as λCM8 and vectors provided herein, such as pTEMPUD [FIG. 5] and pHASPUD (see Example 12) that include satellite DNA, or specifically into rDNA in the centromeric regions of chromosomes containing rDNA sequences. After introduction of the DNA, the cells are grown under selective conditions. The resulting cells are examined and any that have multicentric, particularly dicentric, chromosomes [or heterochromatic chromosomes or sausage chromosomes or other such structure; see, FIGS. 2D, 2E and 2F] are selected.

[0110] In particular, if a cell with a dicentric chromosome is selected, it can be grown under selective conditions, or, preferably, additional DNA encoding a second selectable marker is introduced, and the cells grown under conditions selective for the second marker. The resulting cells should include chromosomes that have structures similar to those depicted in FIGS. 2D, 2E, 2F. Cells with a structure, such as the sausage chromosome, FIG. 2D, can be selected and fused with a second cell line to eliminate other chromosomes that are not of interest. If desired, cells with other chromosomes can be selected and treated as described herein. If a cell with a sausage chromosome is selected, it can be treated with an agent, such as BrdU, that destabilizes the chromosome so that the heterochromatic arm forms a chromosome that is substantially heterochromatic [i.e., a megachromosome, see, FIG. 2F]. Structures such as the gigachromosome, in which the heterochromatic arm has amplified but not broken off from the euchromatic arm, will also be observed. The megachromosome is a stable chromosome. Further manipulation, such as fusions and growth in selective conditions and/or BrdU treatment or other such treatment, can lead to fragmentation of the megachromosome to form smaller chromosomes that have the amplicon as the basic repeating unit.

[0111] The megachromosome can be further fragmented in vivo using a chromosome fragmentation vector, such as pTEMPUD [see, FIG. 5 and EXAMPLE 12], pHASPUD or pTERPUD (see Example 12) to ultimately produce a chromosome that comprises a smaller stable replicable unit, about 15 Mb-60 Mb, containing one to four megareplicons.

[0112] Thus, the stable chromosomes formed de novo that originate from the short arm of mouse chromosome 7 have been analyzed. This chromosome region shows a capacity for amplification of large chromosome segments, and promotes de novo chromosome formation. Large-scale amplification at the same chromosome region leads to the formation of dicentric and multicentric chromosomes, a minichromosome, the 150-200 Mb size λ neo-chromosome, the “sausage” chromosome, the 500-1000 Mb gigachromosome, and the stable 250-400 Mb megachromosome.

[0113] A clear segmentation is observed along the arms of the megachromosome, and analyses show that the building units of this chromosome are amplicons of ~30 Mb composed of mouse major satellite DNA with the integrated “foreign” DNA sequences at both ends. The ~30 Mb amplicons are composed of two ~15 Mb inverted doublets of ~7.5 Mb mouse major satellite DNA blocks, which are separated from each other by a narrow band of non-satellite sequences [see, e.g., FIG. 3]. The wider non-satellite regions at the amplicon borders contain integrated, exogenous [heterologous] DNA, while the narrow bands of non-satellite DNA sequences within the amplicons are integral parts of the pericentric heterochromatin of mouse chromosomes. These results indicate that the ~7.5 Mb blocks flanked by non-satellite DNA are the building units of the pericentric heterochromatin of mouse chromosomes, and the ~15 Mb size pericentric regions of mouse chromosomes contain two ~7.5 Mb units.
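The size relationships described above can be checked with a short calculation. The following Python sketch is illustrative only: it uses the approximate block, doublet and amplicon sizes stated in the text, and the function name `amplicon_count_range` is hypothetical. The resulting amplicon counts are rough estimates that ignore the euchromatic terminal segments and the non-satellite bands.

```python
# Approximate sizes in megabases (Mb), taken from the figures quoted in the text.
SATELLITE_BLOCK_MB = 7.5               # one mouse major satellite DNA block
DOUBLET_MB = 2 * SATELLITE_BLOCK_MB    # inverted doublet of two blocks
AMPLICON_MB = 2 * DOUBLET_MB           # amplicon composed of two doublets


def amplicon_count_range(chrom_min_mb, chrom_max_mb, amplicon_mb=AMPLICON_MB):
    """Return (low, high) rough amplicon counts for a chromosome size range.

    Hypothetical helper: simply divides chromosome size by amplicon size.
    """
    return chrom_min_mb / amplicon_mb, chrom_max_mb / amplicon_mb


# Size range stated for the megachromosome: 250-400 Mb.
low, high = amplicon_count_range(250, 400)
print(f"doublet = {DOUBLET_MB} Mb, amplicon = {AMPLICON_MB} Mb")
print(f"250-400 Mb megachromosome: roughly {low:.1f} to {high:.1f} amplicons")
```

Under these assumptions a 250-400 Mb megachromosome would correspond to on the order of 8 to 13 amplicons of ~30 Mb each.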

[0114] Apart from the euchromatic terminal segments, the whole megachromosome is heterochromatic, and has structural homogeneity. Therefore, this large chromosome offers a unique possibility for obtaining information about the amplification process, for analyzing some basic characteristics of the pericentric constitutive heterochromatin, and for use as a vector for heterologous DNA and as a target for further fragmentation.

[0115] As shown herein, this phenomenon is generalizable and can be observed with other chromosomes. Also, although these de novo formed chromosome segments and chromosomes appear different, there are similarities that indicate that a similar amplification mechanism plays a role in their formation: (i) in each case, the amplification is initiated in the centromeric region of the mouse chromosomes and large (Mb size) amplicons are formed; (ii) mouse major satellite DNA sequences are constant constituents of the amplicons, either by providing the bulk of the heterochromatic amplicons [H-type amplification], or by bordering the euchromatic amplicons [E-type amplification]; (iii) formation of inverted segments can be demonstrated in the λ neo-chromosome and megachromosome; (iv) chromosome arms and chromosomes formed by the amplification are stable and functional.

[0116] The presence of inverted chromosome segments seems to be a common phenomenon in the chromosomes formed de novo at the centromeric region of mouse chromosome 7. During the formation of the neo-minichromosome, the event leading to the stabilization of the distal segment of mouse chromosome 7 that bears the neo-centromere may have been the formation of its inverted duplicate. Amplicons of the megachromosome are inverted doublets of ~7.5 Mb mouse major satellite DNA blocks.

[0117] 5. Cell lines

[0118] Cell lines that contain MACs, such as the minichromosome, the λ-neo chromosome, and the SATACs are provided herein or can be produced by the methods herein. Such cell lines provide a convenient source of these chromosomes and can be manipulated, such as by cell fusion or production of microcells for fusion with selected cell lines, to deliver the chromosome of interest into hybrid cell lines. Exemplary cell lines are described herein and some have been deposited with the ECACC.

[0119] a. EC3/7C5 and EC3/7C6

[0120] Cell lines EC3/7C5 and EC3/7C6 were produced by single cell cloning of EC3/7. For exemplary purposes EC3/7C5 has been deposited with the ECACC. These cell lines contain a minichromosome and the formerly dicentric chromosome from EC3/7. The stable minichromosomes in cell lines EC3/7C5 and EC3/7C6 appear to be the same and they seem to be duplicated derivatives of the ~10-15 Mb “broken-off” fragment of the dicentric chromosome. Their similar size in these independently generated cell lines might indicate that ~20-30 Mb is the minimal or close to the minimal physical size for a stable minichromosome.

[0121] b. TF1004G19

[0122] Introduction of additional heterologous DNA, including DNA encoding a second selectable marker, hygromycin phosphotransferase, i.e., the hygromycin-resistance gene, and also a detectable marker, β-galactosidase (i.e., encoded by the lacZ gene), into the EC3/7C5 cell line and growth under selective conditions produced cells designated TF1004G19. In particular, this cell line was produced from the EC3/7C5 cell line by cotransfection with plasmids pH132, which contains an anti-HIV ribozyme and hygromycin-resistance gene, pCH110 [encodes β-galactosidase] and λ phage [λcI 857 Sam 7] DNA and selection with hygromycin B.

[0123] Detailed analysis of the TF1004G19 cell line by in situ hybridization with λ phage and plasmid DNA sequences revealed the formation of the sausage chromosome. The formerly dicentric chromosome of the EC3/7C5 cell line translocated to the end of another acrocentric chromosome. The heterologous DNA integrated into the pericentric heterochromatin of the formerly dicentric chromosome and is amplified several times with megabases of mouse pericentric heterochromatic satellite DNA sequences [FIG. 2D] forming the “sausage” chromosome. Subsequently the acrocentric mouse chromosome was substituted by a euchromatic telomere.

[0124] In situ hybridization with biotin-labeled subfragments of the hygromycin-resistance and β-galactosidase genes resulted in a hybridization signal only in the heterochromatic arm of the sausage chromosome, indicating that in TF1004G19 transformant cells these genes are localized in the pericentric heterochromatin.

[0125] A high level of gene expression, however, was detected. In general, heterochromatin has a silencing effect in Drosophila, yeast and on the HSV-tk gene introduced into satellite DNA at the mouse centromere. Thus, it was of interest to study the TF1004G19 transformed cell line to confirm that genes located in the heterochromatin were indeed expressed, contrary to recognized dogma.

[0126] For this purpose, subclones of TF1004G19, containing a different sausage chromosome [see FIG. 2D], were established by single cell cloning. Southern hybridization of DNA isolated from the subclones with subfragments of hygromycin phosphotransferase and lacZ genes showed a close correlation between the intensity of hybridization and the length of the sausage chromosome. This finding supports the conclusion that these genes are localized in the heterochromatic arm of the sausage chromosome.

[0127] (1) TF1004G-19C5

[0128] TF1004G-19C5 is a mouse LMTK⁻ fibroblast cell line containing neo-minichromosomes and stable “sausage” chromosomes. It is a subclone of TF1004G19 and was generated by single-cell cloning of the TF1004G19 cell line. It has been deposited with the ECACC as an exemplary cell line and exemplary source of a sausage chromosome. Subsequent fusion of this cell line with CHO K20 cells and selection with hygromycin and G418 and HAT (hypoxanthine, aminopterin, and thymidine medium; see Szybalski et al. (1962) Proc. Natl. Acad. Sci. 48:2026) resulted in hybrid cells (designated 19C5xHa4) that carry the sausage chromosome and the neo-minichromosome. BrdU treatment of the hybrid cells, followed by single cell cloning and selection with G418 and/or hygromycin produced various cells that carry chromosomes of interest, including GB43 and G3D5.

[0129] (2) other subclones

[0130] Cell lines GB43 and G3D5 were obtained by treating 19C5xHa4 cells with BrdU followed by growth in G418-containing selective medium and retreatment with BrdU. The two cell lines were isolated by single cell cloning of the selected cells. GB43 cells contain the neo-minichromosome only. G3D5, which has been deposited with the ECACC, carries the neo-minichromosome and the megachromosome. Single cell cloning of this cell line followed by growth of the subclones in G418- and hygromycin-containing medium yielded subclones such as the GHB42 cell line carrying the neo-minichromosome and the megachromosome. H1D3 is a mouse-hamster hybrid cell line carrying the megachromosome, but no neo-minichromosome, and was generated by treating 19C5xHa4 cells with BrdU followed by growth in hygromycin-containing selective medium and single cell subcloning of selected cells. Fusion of this cell line with the CD4⁺ HeLa cell line that also carries DNA encoding an additional selection gene, the neomycin-resistance gene, produced cells [designated H1xHE41 cells] that carry the megachromosome as well as a human chromosome that carries CD4neo. Further BrdU treatment and single cell cloning produced cell lines, such as 1B3, that include cells with a truncated megachromosome.

[0131] 6. DNA constructs used to transform the cells

[0132] Heterologous DNA can be introduced into the cells by transfection or other suitable method at any stage during preparation of the chromosomes [see, e.g., FIG. 4]. In general, incorporation of such DNA into the MACs is assured through site-directed integration, such as may be accomplished by inclusion of λ-DNA in the heterologous DNA (for the exemplified chromosomes), and also an additional selective marker gene. For example, cells containing a MAC, such as the minichromosome or a SATAC, can be cotransfected with a plasmid carrying the desired heterologous DNA, such as DNA encoding an HIV ribozyme, the cystic fibrosis gene, and DNA encoding a second selectable marker, such as hygromycin resistance. Selective pressure is then applied to the cells by exposing them to an agent that is harmful to cells that do not express the new selectable marker. In this manner, cells that include the heterologous DNA in the MAC are identified. Fusion with a second cell line can provide a means to produce cell lines that contain one particular type of chromosomal structure or MAC.

[0133] Various vectors for this purpose are provided herein [see, Examples] and others can be readily constructed. The vectors preferably include DNA that is homologous to DNA contained within a MAC in order to target the DNA to the MAC for integration therein. The vectors also include a selectable marker gene and the selected heterologous gene(s) of interest. Based on the disclosure herein and the knowledge of the skilled artisan, one of skill can construct such vectors.

[0134] Of particular interest herein is the vector pTEMPUD and derivatives thereof that can target DNA into the heterochromatic region of selected chromosomes. These vectors can also serve as fragmentation vectors [see, e.g., Example 12].

[0135] Heterologous genes of interest include any gene that encodes a therapeutic product and DNA encoding gene products of interest. These genes and DNA include, but are not limited to: the cystic fibrosis gene [CF], the cystic fibrosis transmembrane regulator (CFTR) gene [see, e.g., U.S. Pat. No. 5,240,846; Rosenfeld et al. (1992) Cell 68:143-155; Hyde et al. (1993) Nature 362:250-255; Kerem et al. (1989) Science 245:1073-1080; Riordan et al. (1989) Science 245:1066-1072; Rommens et al. (1989) Science 245:1059-1065; Osborne et al. (1991) Am. J. Hum. Genetics 48:6089-6122; White et al. (1990) Nature 344:665-667; Dean et al. (1990) Cell 61:863-870; Erlich et al. (1991) Science 252:1643; and U.S. Pat. Nos. 5,453,357, 5,449,604, 5,434,086, and 5,240,846, which provides a retroviral vector encoding the normal CFTR gene].

[0136] B. Isolation of artificial chromosomes

[0137] The MACs provided herein can be isolated by any suitable method known to those of skill in the art. Also, methods are provided herein for effecting substantial purification, particularly of the SATACs. SATACs have been isolated by fluorescence-activated cell sorting [FACS]. This method takes advantage of the nucleotide base content of the SATACs, which, by virtue of their high heterochromatic DNA content, will differ from any other chromosomes in a cell. In a particular embodiment, metaphase chromosomes are isolated and stained with base-specific dyes, such as Hoechst 33258 and chromomycin A3. Fluorescence-activated cell sorting will separate the SATACs from the endogenous chromosomes. A dual-laser cell sorter [FACS Vantage, Becton Dickinson Immunocytometry Systems], in which two lasers were set to excite the dyes separately, allowed a bivariate analysis of the chromosomes by base-pair composition and size. Cells containing such SATACs can be similarly sorted.
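The bivariate analysis described above can be pictured as a gate drawn in the two-dimensional space of the two dye signals. The following Python sketch is illustrative only and is not the sorting procedure of the Examples. It assumes that Hoechst 33258 fluorescence reflects A/T-rich DNA and chromomycin A3 fluorescence reflects G/C-rich DNA, and that a SATAC composed largely of A/T-rich satellite DNA therefore shows a high Hoechst signal relative to its chromomycin signal. The class `Event`, the function `in_satac_gate`, the thresholds and the intensity values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One chromosome passing the lasers (arbitrary fluorescence units)."""
    hoechst: float      # signal from the A/T-preferring dye (Hoechst 33258)
    chromomycin: float  # signal from the G/C-preferring dye (chromomycin A3)


def in_satac_gate(event, min_ratio=2.0, min_hoechst=100.0):
    """Hypothetical gate: a bright Hoechst signal that is at least
    min_ratio times the chromomycin signal."""
    return (event.hoechst >= min_hoechst
            and event.hoechst >= min_ratio * event.chromomycin)


# Invented example events; real values depend on instrument and staining.
events = [
    Event(hoechst=400.0, chromomycin=120.0),  # A/T-rich, large: inside gate
    Event(hoechst=300.0, chromomycin=280.0),  # balanced composition: outside
    Event(hoechst=50.0, chromomycin=10.0),    # too dim (small debris): outside
    Event(hoechst=150.0, chromomycin=160.0),  # balanced composition: outside
]

sorted_out = [e for e in events if in_satac_gate(e)]
print(f"{len(sorted_out)} of {len(events)} events fall inside the gate")
```

In practice the gate boundaries would be set empirically from the observed bivariate distribution rather than from fixed thresholds such as those assumed here.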

[0138] Additional methods provided herein for isolation of artificial chromosomes from endogenous chromosomes include procedures that are particularly well suited for large-scale isolation of artificial chromosomes such as SATACs. In these methods, the size and density differences between SATACs and endogenous chromosomes are exploited to effect separation of these two types of chromosomes. Such methods involve techniques such as swinging bucket centrifugation, zonal rotor centrifugation, and velocity sedimentation. Affinity-, particularly immunoaffinity-, based methods for separation of artificial from endogenous chromosomes are also provided herein. For example, SATACs, which are predominantly heterochromatin, may be separated from endogenous chromosomes through immunoaffinity procedures involving antibodies that specifically recognize heterochromatin, and/or the proteins associated therewith, when the endogenous chromosomes contain relatively little heterochromatin, such as in hamster cells.

[0139] C. In vitro construction of artificial chromosomes

[0140] Artificial chromosomes can be constructed in vitro by assembling the structural and functional elements that contribute to a complete chromosome capable of stable replication and segregation alongside endogenous chromosomes in cells. The identification of the discrete elements that in combination yield a functional chromosome has made possible the in vitro generation of artificial chromosomes. The process of in vitro construction of artificial chromosomes, which can be rigidly controlled, provides advantages that may be desired in the generation of chromosomes that, for example, are required in large amounts or that are intended for specific use in transgenic animal systems.

[0141] For example, in vitro construction may be advantageous when efficiency of time and scale are important considerations in the preparation of artificial chromosomes. Because in vitro construction methods do not involve extensive cell culture procedures, they may be utilized when the time and labor required to transform, feed, cultivate, and harvest cells used in in vivo cell-based production systems are unavailable.

[0142] In vitro construction may also be rigorously controlled with respect to the exact manner in which the several elements of the desired artificial chromosome are combined and in what sequence and proportions they are assembled to yield a chromosome of precise specifications. These aspects may be of significance in the production of artificial chromosomes that will be used in live animals where it is desirable to be certain that only very pure and specific DNA sequences in specific amounts are being introduced into the host animal.

[0143] The following describes the processes involved in the construction of artificial chromosomes in vitro, utilizing a megachromosome as exemplary starting material.

[0144] 1. Identification and isolation of the components of the artificial chromosome

[0145] The MACs provided herein, particularly the SATACs, are elegantly simple chromosomes for use in the identification and isolation of components to be used in the in vitro construction of artificial chromosomes. The ability to purify MACs to a very high level of purity, as described herein, facilitates their use for these purposes. For example, the megachromosome, particularly truncated forms thereof [i.e., those of cell lines, such as 1B3 and mM2C1, which are derived from H1D3 (deposited at the European Collection of Animal Cell Culture (ECACC) under Accession No. 96040929, see EXAMPLES below)], can serve as starting material.

[0146] For example, the mM2C1 cell line contains a micro-megachromosome (~50-60 Mb), which advantageously contains only one centromere, two regions of integrated heterologous DNA with adjacent rDNA sequences, with the remainder of the chromosomal DNA being mouse major satellite DNA. Other truncated megachromosomes can serve as a source of telomeres, or telomeres can be provided (see, Examples below regarding construction of plasmids containing tandemly repeated telomeric sequences). The centromere of the mM2C1 cell line contains mouse minor satellite DNA, which provides a useful tag for isolation of the centromeric DNA.

[0147] Additional features of particular SATACs provided herein, such as the micro-megachromosome of the mM2C1 cell line, that make them uniquely suited to serve as starting materials in the isolation and identification of chromosomal components include the fact that the centromeres of each megachromosome within a single specific cell line are identical. The ability to begin with a homogeneous centromere source (as opposed to a mixture of different chromosomes having differing centromeric sequences) greatly facilitates the cloning of the centromere DNA. By digesting purified megachromosomes, particularly truncated megachromosomes, such as the micro-megachromosome, with appropriate restriction endonucleases and cloning the fragments into the commercially available and well known YAC vectors (see, e.g., Burke et al. (1987) Science 236:806-812), BAC vectors (see, e.g., Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797; bacterial artificial chromosomes, which have a capacity of incorporating 0.9-1 Mb of DNA) or PAC vectors (the P1 artificial chromosome vector, which is a P1 plasmid derivative that has a capacity of incorporating 300 kb of DNA and that is delivered to E. coli host cells by electroporation rather than by bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574; Pierce et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S. Pat. No. 5,300,431 and International PCT application No. WO 92/14819), it is possible for as few as 50 clones to represent the entire micro-megachromosome.
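The figure of about 50 clones can be related to chromosome size and insert capacity by simple division. The following Python sketch is illustrative only. It assumes a micro-megachromosome of roughly 50-60 Mb (an assumption, chosen to be consistent with the 15-60 Mb range given herein for the smaller stable units) and the 0.9-1 Mb and 300 kb insert capacities quoted above. The function name `min_clones` is hypothetical, and the calculation gives only a lower bound because it assumes perfectly tiled, non-overlapping inserts.

```python
import math


def min_clones(chromosome_mb, insert_mb):
    """Smallest number of non-overlapping inserts of size insert_mb (Mb)
    needed to span a chromosome of size chromosome_mb (Mb)."""
    if chromosome_mb <= 0 or insert_mb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(chromosome_mb / insert_mb)


# Assumed chromosome sizes (Mb) and the insert capacities quoted in the text.
for chromosome_mb in (50, 60):
    for label, insert_mb in (("1 Mb", 1.0), ("0.9 Mb", 0.9), ("300 kb", 0.3)):
        n = min_clones(chromosome_mb, insert_mb)
        print(f"{chromosome_mb} Mb chromosome, {label} inserts: at least {n} clones")
```

With 1 Mb inserts and a 50 Mb chromosome the lower bound is 50 clones; a real library would need more clones to allow for overlap and uneven representation.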

[0148] a. Centromeres

[0149] An exemplary centromere for use in the construction of a mammalian artificial chromosome is that contained within the megachromosome of any of the megachromosome-containing cell lines provided herein, such as, for example, H1D3 and derivatives thereof, such as mM2C1 cells. Megachromosomes are isolated from such cell lines utilizing, for example, the procedures described herein, and the centromeric sequence is extracted from the isolated megachromosomes. For example, the megachromosomes may be separated into fragments utilizing selected restriction endonucleases that recognize and cut at sites that, for instance, are primarily located in the replication and/or heterologous DNA integration sites and/or in the satellite DNA. Based on the sizes of the resulting fragments, certain undesired elements may be separated from the centromere-containing sequences. The centromere-containing DNA could be as large as 1 Mb.

[0150] Probes that specifically recognize the centromeric sequences, such as mouse minor satellite DNA-based probes [see, e.g., Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], may be used to isolate the centromere-containing YAC, BAC or PAC clones derived from the megachromosome. Alternatively, or in conjunction with the direct identification of centromere-containing megachromosomal DNA, probes that specifically recognize the non-centromeric elements, such as probes specific for mouse major satellite DNA, the heterologous DNA and/or rDNA, may be used to identify and eliminate the non-centromeric DNA-containing clones.

[0151] Additionally, centromere cloning methods described herein may be utilized to isolate the centromere-containing sequence of the megachromosome. For example, Example 12 describes the use of YAC vectors in combination with the murine tyrosinase gene and NMRI/Han mice for identification of the centromeric sequence.

[0152] Once the centromere fragment has been isolated, it may be sequenced and the sequence information may in turn be used in PCR amplification of centromere sequences from megachromosomes or other sources of centromeres. Isolated centromeres may also be tested for function in vivo by transferring the DNA into a host mammalian cell. Functional analysis may include, for example, examining the ability of the centromere sequence to bind centromere-binding proteins. The cloned centromere will be transferred to mammalian cells with a selectable marker gene, and the binding of a centromere-specific protein, such as anti-centromere antibodies (e.g., LU851, see, Hadlaczky et al. (1986) Exp. Cell Res. 167:1-15), can be used to assess function of the centromeres.

[0153] b. Telomeres

[0154] Preferred telomeres are the 1 kB synthetic telomere provided herein (see, Examples). A double synthetic telomere construct, which contains a 1 kB synthetic telomere linked to a dominant selectable marker gene that continues in an inverted orientation, may be used for ease of manipulation. Such a double construct contains a series of TTAGGG repeats 3′ of the marker gene and a series of repeats of the inverted sequence, i.e., GGGATT, 5′ of the marker gene as follows: (GGGATT)_(n)-dominant marker gene-(TTAGGG)_(n). Using an inverted marker provides an easy means for insertion, such as by blunt end ligation, since only properly oriented fragments will be selected.
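The layout of the double telomere construct can be written out as a string. The following Python sketch is illustrative only: the function names `repeats_for_length` and `double_telomere` are hypothetical, and the marker gene is represented by a placeholder rather than a real sequence. It follows the text in treating GGGATT as the inverted (reversed) form of TTAGGG.

```python
TELOMERE_UNIT = "TTAGGG"
INVERTED_UNIT = TELOMERE_UNIT[::-1]  # reversed repeat unit, i.e. "GGGATT"


def repeats_for_length(target_bp, unit=TELOMERE_UNIT):
    """Number of whole repeat units that fit within target_bp base pairs."""
    return target_bp // len(unit)


def double_telomere(marker, n):
    """Lay out (GGGATT)n - marker - (TTAGGG)n as a single string.

    `marker` stands in for the dominant selectable marker gene.
    """
    return INVERTED_UNIT * n + marker + TELOMERE_UNIT * n


n_units = repeats_for_length(1000)  # whole TTAGGG units in about 1 kb
print(f"about {n_units} repeats ({n_units * len(TELOMERE_UNIT)} bp) per telomere")

# Small example with three repeats per side and a placeholder marker.
print(double_telomere("[marker]", 3))
```

A telomere of about 1 kb corresponds to roughly 166 whole copies of the six-base unit on each side of the marker gene.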

[0155] c. Megareplicator

[0156] The megareplicator sequences, such as the rDNA, provided herein are preferred for use in in vitro constructs. The rDNA provides an origin of replication and also provides sequences that facilitate amplification of the artificial chromosome in vivo to increase the size of the chromosome to, for example, accommodate increasing copies of a heterologous gene of interest as well as continuous high levels of expression of the heterologous genes.

[0157] d. Filler heterochromatin

[0158] Filler heterochromatin, particularly satellite DNA, is included to maintain structural integrity and stability of the artificial chromosome and provide a structural base for carrying genes within the chromosome. The satellite DNA is typically A/T-rich DNA sequence, such as mouse major satellite DNA, or G/C-rich DNA sequence, such as hamster natural satellite DNA. Sources of such DNA include any eukaryotic organisms that carry non-coding satellite DNA with sufficient A/T or G/C composition to promote ready separation by sequence, such as by FACS, or by density gradients. The satellite DNA may also be synthesized by generating sequence containing monotone, tandem repeats of highly A/T- or G/C-rich DNA units.
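The base composition of a synthetic filler built from monotone tandem repeats is fixed by the composition of the repeated unit. The following Python sketch is illustrative only: the repeat units shown are invented examples rather than sequences from the Examples, and the function names `at_fraction` and `tandem_repeat` are hypothetical.

```python
def at_fraction(seq):
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def tandem_repeat(unit, copies):
    """Monotone tandem repeat: the same unit repeated head-to-tail."""
    return unit * copies


# Hypothetical repeat units chosen only to illustrate the calculation.
for unit in ("AATATT", "AATTGC", "GGCGCC"):
    filler = tandem_repeat(unit, 1000)
    print(f"unit {unit}: {len(filler)} bp filler, A/T fraction {at_fraction(filler):.2f}")
```

Because the repeat is monotone, the A/T fraction of the full filler equals that of the unit, so the composition used for separation by FACS or density gradient can be chosen at the level of the unit.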

[0159] The most suitable amount of filler heterochromatin for use in construction of the artificial chromosome may be empirically determined by, for example, including segments of various lengths, increasing in size, in the construction process. Fragments that are too small to be suitable for use will not provide for a functional chromosome, which may be evaluated in cell-based expression studies, or will result in a chromosome of limited functional lifetime or mitotic and structural stability.

[0160] e. Selectable marker

[0161] Any convenient selectable marker may be used, at any convenient locus in the MAC.

[0162] 2. Combination of the isolated chromosomal elements

[0163] Once the isolated elements are obtained, they may be combined to generate the complete, functional artificial chromosome. This assembly can be accomplished, for example, by in vitro ligation in solution, in LMP agarose or on microbeads. The ligation is conducted so that one end of the centromere is directly joined to a telomere. The other end of the centromere, which serves as the gene-carrying chromosome arm, is built up from a combination of satellite DNA and rDNA sequence and may also contain a selectable marker gene. Another telomere is joined to the end of the gene-carrying chromosome arm. The gene-carrying arm is the site at which any heterologous genes of interest, for example, in expression of desired proteins encoded thereby, are incorporated either during in vitro construction of the chromosome or sometime thereafter.

[0164] 3. Analysis and testing of the artificial chromosome

[0165] Artificial chromosomes constructed in vitro may be tested for functionality in in vivo mammalian cell systems, using any of the methods described herein for the SATACs or minichromosomes, or any methods known to those of skill in the art.

[0166] 4. Introduction of desired heterologous DNA into the in vitro synthesized chromosome

[0167] Heterologous DNA may be introduced into the in vitro synthesized chromosome using routine methods of molecular biology, may be introduced using the methods described herein for the SATACs, or may be incorporated into the in vitro synthesized chromosome as part of one of the synthetic elements, such as the heterochromatin. The heterologous DNA may be linked to a selected repeated fragment, and then the resulting construct may be amplified in vitro using the methods for such in vitro amplification provided herein (see the Examples).

[0168] D. Introduction of artificial chromosomes into cells, tissues, animals and plants

[0169] Suitable hosts for introduction of the MACs provided herein include, but are not limited to, any animal or plant, cell or tissue thereof, including, but not limited to: mammals, birds, reptiles, amphibians, insects, fish, arachnids, tobacco, tomato, wheat, plants and algae. The MACs, if contained in cells, may be introduced by cell fusion or microcell fusion or, if the MACs have been isolated from cells, they may be introduced into host cells by any method known to those of skill in this art, including but not limited to: direct DNA transfer, electroporation, lipid-mediated transfer, e.g., lipofection and liposomes, microprojectile bombardment, microinjection in cells and embryos, protoplast regeneration for plants, and any other suitable method [see, e.g., Weissbach et al. (1988) Methods for Plant Molecular Biology, Academic Press, N.Y., Section VIII, pp. 421-463; Grierson et al. (1988) Plant Molecular Biology, 2d Ed., Blackie, London, Ch. 7-9; see, also U.S. Pat. Nos. 5,491,075; 5,482,928; and 5,424,409; see, also, e.g., U.S. Pat. No. 5,470,708, which describes particle-mediated transformation of mammalian unattached cells].

[0170] Other methods for introducing DNA into cells include nuclear microinjection and bacterial protoplast fusion with intact cells. Polycations, such as polybrene and polyornithine, may also be used. For various techniques for transforming mammalian cells, see e.g., Keown et al. Methods in Enzymology (1990) Vol. 185, pp. 527-537; and Mansour et al. (1988) Nature 336:348-352.

[0171] For example, isolated, purified artificial chromosomes can be injected into an embryonic cell line such as a human kidney primary embryonic cell line [ATCC accession number CRL 1573] or embryonic stem cells [see, e.g., Hogan et al. (1994) Manipulating the Mouse Embryo, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., see, especially, pages 255-264 and Appendix 3].

[0172] Preferably the chromosomes are introduced by microinjection, using a system such as the Eppendorf automated microinjection system, and grown under selective conditions, such as in the presence of hygromycin B or neomycin.

[0173] 1. Methods for introduction of chromosomes into hosts

[0174] Depending on the host cell used, transformation is done using standard techniques appropriate to such cells. These methods include any known to those of skill in the art, including those described herein.

[0175] a. DNA uptake

[0176] For mammalian cells that do not have cell walls, the calcium phosphate precipitation method for introduction of exogenous DNA [see, e.g., Graham et al. (1978) Virology 52:456-457; Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376; and Current Protocols in Molecular Biology, Vol. 1, Wiley Inter-Science, Supplement 14, Unit 9.1.1-9.1.9 (1990)] is often preferred. DNA uptake can be accomplished by DNA alone or in the presence of polyethylene glycol [PEG-mediated gene transfer], which is a fusion agent, or by any variations of such methods known to those of skill in the art [see, e.g., U.S. Pat. No. 4,684,611].

[0177] Lipid-mediated carrier systems are also among the preferred methods for introduction of DNA into cells [see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolc'h et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth. Enzymol. 217:599-618]. Lipofection [see, e.g., Strauss (1996) Meth. Mol. Biol. 54:307-327] may also be used to introduce DNA into cells. This method is particularly well-suited for transfer of exogenous DNA into chicken cells (e.g., chicken blastodermal cells and primary chicken fibroblasts; see Brazolot et al. (1991) Mol. Repro. Dev. 30:304-312). In particular, DNA of interest can be introduced into chickens in operative linkage with promoters from genes, such as lysozyme and ovalbumin, that are expressed in the egg, thereby permitting expression of the heterologous DNA in the egg.

[0178] Additional methods useful in the direct transfer of DNA into cells include particle gun electrofusion [see, e.g., U.S. Pat. Nos. 4,955,378, 4,923,814, 4,476,004, 4,906,576 and 4,441,972] and virion-mediated gene transfer.

[0179] A commonly used approach for gene transfer in land plants involves the direct introduction of purified DNA into protoplasts. The three basic methods for direct gene transfer into plant cells include: 1) polyethylene glycol [PEG]-mediated DNA uptake, 2) electroporation-mediated DNA uptake and 3) microinjection. In addition, plants may be transformed using ultrasound treatment [see, e.g., International PCT application publication No. WO 91/00358].

[0180] b. Electroporation

[0181] Electroporation involves providing high-voltage electrical pulses to a solution containing a mixture of protoplasts and foreign DNA to create reversible pores in the membranes of plant protoplasts as well as other cells. Electroporation is generally used for prokaryotes or other cells, such as plants, that contain substantial cell-wall barriers. Methods for effecting electroporation are well known [see, e.g., U.S. Pat. Nos. 4,784,737, 5,501,967, 5,501,662, 5,019,034, 5,503,999; see, also Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828].

[0182] For example, electroporation is often used for transformation of plants [see, e.g., Ag Biotechnology News 7:3 and 17 (September/October 1990)]. In this technique, plant protoplasts are electroporated in the presence of the DNA of interest that also includes a phenotypic marker. Electrical impulses of high field strength reversibly permeabilize biomembranes allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus. Transformed plant cells will be identified by virtue of the expressed phenotypic marker. The exogenous DNA may be added to the protoplasts in any form such as, for example, naked linear, circular or supercoiled DNA, DNA encapsulated in liposomes, DNA in spheroplasts, DNA in other plant protoplasts, DNA complexed with salts, and other methods.

[0183] c. Microcells

[0184] The chromosomes can be transferred by preparing microcells containing an artificial chromosome and then fusing with selected target cells. Methods for such preparation and fusion of microcells are well known [see the Examples and also see, e.g., U.S. Pat. Nos. 5,240,840, 4,806,476, 5,298,429, 5,396,767, Fournier (1981) Proc. Natl. Acad. Sci. U.S.A. 78:6349-6353; and Lambert et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-59]. Microcell fusion, using microcells that contain an artificial chromosome, is a particularly useful method for introduction of MACs into avian cells, such as DT40 chicken pre-B cells [for a description of DT40 cell fusion, see, e.g., Dieken et al. (1996) Nature Genet. 12:174-182].

[0185] 2. Hosts

[0186] Suitable hosts include any host known to be useful for introduction and expression of heterologous DNA. Of particular interest herein are animal and plant cells and tissues, including, but not limited to, insect cells and larvae, plants, and animals, particularly transgenic (non-human) animals, and animal cells. Other hosts include, but are not limited to, mammals, birds, particularly fowl such as chickens, reptiles, amphibians, insects, fish, arachnids, tobacco, tomato, wheat, monocots, dicots and algae, and any host into which introduction of heterologous DNA is desired. Such introduction can be effected using the MACs provided herein, or, if necessary, by using the MACs provided herein to identify species-specific centromeres and/or functional chromosomal units and then using the resulting centromeres or chromosomal units as artificial chromosomes, or alternatively, using the methods exemplified herein for production of MACs to produce species-specific artificial chromosomes.

[0187] a. Introduction of DNA into embryos for production of transgenic (non-human) animals and introduction of DNA into animal cells

[0188] Transgenic (non-human) animals can be produced by introducing exogenous genetic material into a pronucleus of a mammalian zygote by microinjection [see, e.g., U.S. Pat. Nos. 4,873,191 and 5,354,674; see, also, International PCT application publication No. WO 95/14769, which is based on U.S. application Ser. No. 08/159,084]. The zygote is capable of development into a mammal. The embryo or zygote is transplanted into a host female uterus and allowed to develop. Detailed protocols and examples are set forth below.

[0189] Nuclear transfer may also be used [see, Wilmut et al. (1997) Nature 385:810-813, International PCT application Nos. WO 97/07669 and WO 97/07668]. Briefly, in this method, the SATAC containing the genes of interest is introduced, by any suitable method, into an appropriate donor cell, such as a mammary gland cell, that contains totipotent nuclei. The diploid nucleus of the cell, which is either in G0 or G1 phase, is then introduced, such as by cell fusion or microinjection, into an unactivated oöcyte, preferably an enucleated cell, which is arrested in the metaphase of the second meiotic division. Enucleation may be effected by any suitable method, such as actual removal, or by treating with means, such as ultraviolet light, that functionally remove the nucleus. The oöcyte is then activated, preferably after a period of contact, about 6-20 hours for cattle, of the new nucleus with the cytoplasm, while maintaining correct ploidy, to produce a reconstituted embryo, which is then introduced into a host. Ploidy is maintained during activation, for example, by incubating the reconstituted cell in the presence of a microtubule inhibitor, such as nocodazole, colchicine, colcemid, and taxol, whereby the DNA replicates once.

[0190] Transgenic chickens can be produced by injection of dispersed blastodermal cells from Stage X chicken embryos into recipient embryos at a similar stage of development [see e.g., Etches et al. (1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189]. Heterologous DNA is first introduced into the donor blastodermal cells using methods such as, for example, lipofection [see, e.g., Brazolot et al. (1991) Mol. Repro. Dev. 30:304-312] or microcell fusion [see, e.g., Dieken et al. (1996) Nature Genet. 12:174-182]. The transfected donor cells are then injected into recipient chicken embryos [see e.g., Carsience et al. (1993) Development 117:669-675]. The recipient chicken embryos within the shell are candled and allowed to hatch to yield a germline chimeric chicken.

[0191] DNA can be introduced into animal cells using any known procedure, including, but not limited to: direct uptake, incubation with polyethylene glycol [PEG], microinjection, electroporation, lipofection, cell fusion, microcell fusion, particle bombardment, including microprojectile bombardment [see, e.g., U.S. Pat. No. 5,470,708, which provides a method for transforming unattached mammalian cells via particle bombardment], and any other such method. For example, the transfer of plasmid DNA in liposomes directly to human cells in situ has been approved by the FDA for use in humans [see, e.g., Nabel, et al. (1990) Science 249:1285-1288 and U.S. Pat. No. 5,461,032].

[0192] b. Introduction of heterologous DNA into plants

[0193] Numerous methods for producing or developing transgenic plants are available to those of skill in the art. The method used is primarily a function of the species of plant. These methods include, but are not limited to: direct transfer of DNA by processes, such as PEG-induced DNA uptake, protoplast fusion, microinjection, electroporation, and microprojectile bombardment [see, e.g., Uchimiya et al. (1989) J. of Biotech. 12:1-20 for a review of such procedures, see, also, e.g., U.S. Pat. Nos. 5,436,392 and 5,489,520 and many others]. For purposes herein, when introducing a MAC, microinjection, protoplast fusion and particle gun bombardment are preferred.

[0194] Plant species, including tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato and tomato, have been used to produce transgenic plants. Tobacco and other species, such as petunias, often serve as experimental models in which the methods have been developed and the genes first introduced and expressed.

[0195] DNA uptake can be accomplished by DNA alone or in the presence of PEG, which is a fusion agent, with plant protoplasts or by any variations of such methods known to those of skill in the art [see, e.g., U.S. Pat. No. 4,684,611 to Schilperoot et al.]. Electroporation, which involves applying high-voltage electrical pulses to a solution containing a mixture of protoplasts and foreign DNA to create reversible pores, has been used, for example, to successfully introduce foreign genes into rice and Brassica napus. Microinjection of DNA into plant cells, including cultured cells and cells in intact plant organs and embryoids in tissue culture, and microprojectile bombardment [acceleration of small high density particles, which contain the DNA, to high velocity with a particle gun apparatus, which forces the particles to penetrate plant cell walls and membranes] have also been used. All plant cells into which DNA can be introduced and that can be regenerated from the transformed cells can be used to produce transformed whole plants which contain the transferred artificial chromosome. The particular protocol and means for introduction of the DNA into the plant host may need to be adapted or refined to suit the particular plant species or cultivar.

[0196] c. Insect cells

[0197] Insects are useful hosts for introduction of artificial chromosomes for numerous reasons, including, but not limited to: (a) amplification of genes encoding useful proteins can be accomplished in the artificial chromosome to obtain higher protein yields in insect cells; (b) insect cells support required post-translational modifications, such as glycosylation and phosphorylation, that can be required for protein biological functioning; (c) insect cells do not support mammalian viruses, and, thus, eliminate the problem of cross-contamination of products with such infectious agents; (d) this technology circumvents traditional recombinant baculovirus systems for production of nutritional, industrial or medicinal proteins in insect cell systems; (e) the low temperature optimum for insect cell growth (28° C.) permits reduced energy cost of production; (f) serum-free growth medium for insect cells permits lower production costs; (g) artificial chromosome-containing cells can be stored indefinitely at low temperature; and (h) insect larvae will be biological factories for production of nutritional, medicinal or industrial proteins by microinjection of fertilized insect eggs [see, e.g., Joy et al. (1991) Current Science 66:145-150, which provides a method for microinjecting heterologous DNA into Bombyx mori eggs].

[0198] Either MACs or insect-specific artificial chromosomes [BUGACs] will be used to introduce genes into insects. As described in the Examples, it appears that MACs will function in insects to direct expression of heterologous DNA contained thereon. For example, as described in the Examples, a MAC containing the B. mori actin gene promoter fused to the lacZ gene has been generated by transfection of EC3/7C5 cells with a plasmid containing the fusion gene. Subsequent fusion of the B. mori cells with the transfected EC3/7C5 cells that survived selection yielded a MAC-containing insect-mouse hybrid cell line in which β-galactosidase expression was detectable.

[0199] Insect host cells include, but are not limited to, hosts such as Spodoptera frugiperda [caterpillar], Aedes aegypti [mosquito], Aedes albopictus [mosquito], Drosophila melanogaster [fruitfly], Bombyx mori [silkworm], Manduca sexta [tomato hornworm] and Trichoplusia ni [cabbage looper]. Efforts have been directed toward propagation of insect cells in culture. Such efforts have focused on the fall armyworm, Spodoptera frugiperda. Cell lines have been developed also from other insects such as the cabbage looper, Trichoplusia ni, and the silkworm, Bombyx mori. It has also been suggested that analogous cell lines can be created using the tomato hornworm, Manduca sexta. To introduce DNA into an insect, it should be introduced into the larvae and allowed to proliferate, and then the hemolymph recovered from the larvae so that the proteins can be isolated therefrom.

[0200] The preferred method herein for introduction of artificial chromosomes into insect cells is microinjection [see, e.g., Tamura et al. (1991) Bio Ind. 8:26-31; Nikolaev et al. (1989) Mol. Biol. (Moscow) 23:1177-87; and methods exemplified and discussed herein].

[0201] E. Applications for and Uses of Artificial chromosomes

[0202] Artificial chromosomes provide convenient and useful vectors, and in some instances [e.g., in the case of very large heterologous genes] the only vectors, for introduction of heterologous genes into hosts. Virtually any gene of interest is amenable to introduction into a host via artificial chromosomes. Such genes include, but are not limited to, genes that encode receptors, cytokines, enzymes, proteases, hormones, growth factors, antibodies, tumor suppressor genes, therapeutic products and multigene pathways.

[0203] The artificial chromosomes provided herein will be used in methods of protein and gene product production, particularly using insects as host cells for production of such products, and in cellular (e.g., mammalian cell) production systems in which the artificial chromosomes (particularly MACs) provide a reliable, stable and efficient means for optimizing the biomanufacturing of important compounds for medicine and industry. They are also intended for use in methods of gene therapy, and for production of transgenic plants and animals [discussed above, below and in the EXAMPLES].

[0204] 1. Gene Therapy

[0205] Any nucleic acid encoding a therapeutic gene product or product of a multigene pathway may be introduced into a host animal, such as a human, or into a target cell line for introduction into an animal, for therapeutic purposes. Such therapeutic purposes include genetic therapy to cure or to provide gene products that are missing or defective, to deliver agents, such as anti-tumor agents, to targeted cells or to an animal, and to provide gene products that will confer resistance or reduce susceptibility to a pathogen or ameliorate symptoms of a disease or disorder. The following are some exemplary genes and gene products. Such exemplification is not intended to be limiting.

[0206] a. Anti-HIV ribozymes

[0207] As exemplified below, DNA encoding anti-HIV ribozymes can be introduced and expressed in cells using MACs, including the euchromatin-based minichromosomes and the SATACs. These MACs can be used to make a transgenic mouse that expresses a ribozyme and, thus, serves as a model for testing the activity of such ribozymes or from which ribozyme-producing cell lines can be made. Also, introduction of a MAC that encodes an anti-HIV ribozyme into human cells will serve as treatment for HIV infection. Such systems further demonstrate the viability of using any disease-specific ribozyme to treat or ameliorate a particular disease.

[0208] b. Tumor Suppressor Genes

[0209] Tumor suppressor genes are genes that, in their wild-type alleles, express proteins that suppress abnormal cellular proliferation. When the gene coding for a tumor suppressor protein is mutated or deleted, the resulting mutant protein or the complete lack of tumor suppressor protein expression may result in a failure to correctly regulate cellular proliferation. Consequently, abnormal cellular proliferation may take place, particularly if there is already existing damage to the cellular regulatory mechanism. A number of well-studied human tumors and tumor cell lines have been shown to have missing or nonfunctional tumor suppressor genes.

[0210] Examples of tumor suppressor genes include, but are not limited to, the retinoblastoma susceptibility gene or RB gene, the p53 gene, the gene that is deleted in colon carcinoma [i.e., the DCC gene] and the neurofibromatosis type 1 [NF-1] tumor suppressor gene [see, e.g., U.S. Pat. No. 5,496,731; Weinberg et al. (1991) Science 254:1138-1146]. Loss of function or inactivation of tumor suppressor genes may play a central role in the initiation and/or progression of a significant number of human cancers.

[0211] The p53 Gene

[0212] Somatic cell mutations of the p53 gene are said to be the most frequent of the gene mutations associated with human cancer [see, e.g., Weinberg et al. (1991) Science 254:1138-1146]. The normal or wild-type p53 gene is a negative regulator of cell growth, which, when damaged, favors cell transformation. The p53 expression product is found in the nucleus, where it may act in parallel or cooperatively with other gene products. Tumor cell lines in which p53 has been deleted have been successfully treated with wild-type p53 vector to reduce tumorigenicity [see, Baker et al. (1990) Science 249:912-915].

[0213] DNA encoding the p53 gene and plasmids containing this DNA are well known [see, e.g., U.S. Pat. No. 5,260,191; see, also Chen et al. (1990) Science 250:1576; Farrel et al. (1991) EMBO J. 10:2879-2887; plasmids containing the gene are available from the ATCC, and the sequence is in the GenBank Database, accession nos. X54156, X60020, M14695, M16494, K03199].

[0214] c. The CFTR gene

[0215] Cystic fibrosis [CF] is an autosomal recessive disease that affects epithelia of the airways, sweat glands, pancreas, and other organs. It is a lethal genetic disease associated with a defect in chloride ion transport, and is caused by mutations in the gene coding for the cystic fibrosis transmembrane conductance regulator [CFTR], a 1480 amino acid protein that has been associated with the expression of chloride conductance in a variety of eukaryotic cell types. Defects in CFTR destroy or reduce the ability of epithelial cells in the airways, sweat glands, pancreas and other tissues to transport chloride ions in response to cAMP-mediated agonists and impair activation of apical membrane channels by cAMP-dependent protein kinase A [PKA]. Given the high incidence and devastating nature of this disease, development of effective CF treatments is imperative.

[0216] The CFTR gene [˜250 kb] can be transferred into a MAC for use, for example, in gene therapy as follows. A CF-YAC [see Green et al. Science 250:94-98] may be modified to include a selectable marker, such as a gene encoding a protein that confers resistance to puromycin or hygromycin, and λ-DNA for use in site-specific integration into a neo-minichromosome or a SATAC. Such a modified CF-YAC can be introduced into MAC-containing cells, such as EC3/7C5 or 19C5xHa4 cells, by fusion with yeast protoplasts harboring the modified CF-YAC or microinjection of yeast nuclei harboring the modified CF-YAC into the cells. Stable transformants are then selected on the basis of antibiotic resistance. These transformants will carry the modified CF-YAC within the MAC contained in the cells.

[0217] 2. Animals, birds, fish and plants that are genetically altered to possess desired traits such as resistance to disease

[0218] Artificial chromosomes are ideally suited for preparing animals, including vertebrates and invertebrates, including birds and fish as well as mammals, that possess certain desired traits, such as, for example, disease resistance, resistance to harsh environmental conditions, altered growth patterns, and enhanced physical characteristics.

[0219] One example of the use of artificial chromosomes in generating disease-resistant organisms involves the preparation of multivalent vaccines. Such vaccines include genes encoding multiple antigens that can be carried in a MAC, or species-specific artificial chromosome, and either delivered to a host to induce immunity, or incorporated into embryos to produce transgenic (non-human) animals and plants that are immune or less susceptible to certain diseases.

[0220] Disease-resistant animals and plants may also be prepared in which resistance or decreased susceptibility to disease is conferred by introduction into the host organism or embryo of artificial chromosomes containing DNA encoding gene products (e.g., ribozymes and proteins that are toxic to certain pathogens) that destroy or attenuate pathogens or limit access of pathogens to the host.

[0221] Animals and plants possessing desired traits that might, for example, enhance utility, processibility and commercial value of the organisms in areas such as the agricultural and ornamental plant industries may also be generated using artificial chromosomes in the same manner as described above for production of disease-resistant animals and plants. In such instances, the artificial chromosomes that are introduced into the organism or embryo contain DNA encoding gene products that serve to confer the desired trait in the organism.

[0222] Birds, particularly fowl such as chickens, fish and crustaceans will serve as model hosts for production of genetically altered organisms using artificial chromosomes.

[0223] 3. Use of MACs and other artificial chromosomes for preparation and screening of libraries

[0224] Since large fragments of DNA can be incorporated into each artificial chromosome, the chromosomes are well-suited for use as cloning vehicles that can accommodate entire genomes in the preparation of genomic DNA libraries, which then can be readily screened. For example, MACs may be used to prepare a genomic DNA library useful in the identification and isolation of functional centromeric DNA from different species of organisms. In such applications, the MAC used to prepare a genomic DNA library from a particular organism is one that is not functional in cells of that organism. That is, the MAC does not stably replicate, segregate or provide for expression of genes contained within it in cells of the organism. Preferably, the MACs contain an indicator gene (e.g., the lacZ gene encoding β-galactosidase or genes encoding products that confer resistance to antibiotics such as neomycin, puromycin, hygromycin) linked to a promoter that is capable of promoting transcription of the indicator gene in cells of the organism. Fragments of genomic DNA from the organism are incorporated into the MACs, and the MACs are transferred to cells from the organism. Cells that contain MACs that have incorporated functional centromeres contained within the genomic DNA fragments are identified by detection of expression of the marker gene.

[0225] 4. Use of MACs and other artificial chromosomes for stable, high-level protein production

[0226] Cells containing the MACs and/or other artificial chromosomes provided herein are advantageously used for production of proteins, particularly several proteins from one cell line, such as multiple proteins involved in a biochemical pathway or multivalent vaccines. The genes encoding the proteins are introduced into the artificial chromosomes which are then introduced into cells. Alternatively, the heterologous gene(s) of interest are transferred into a production cell line that already contains artificial chromosomes in a manner that targets the gene(s) to the artificial chromosomes. The cells are cultured under conditions whereby the heterologous proteins are expressed. Because the proteins will be expressed at high levels in a stable permanent extra-genomic chromosomal system, selective conditions are not required.

[0227] Any transfectable cells capable of serving as recombinant hosts adaptable to continuous propagation in a cell culture system [see, e.g., McLean (1993) Trends In Biotech. 11:232-238] are suitable for use in an artificial chromosome-based protein production system. Exemplary host cell lines include, but are not limited to, the following: Chinese hamster ovary (CHO) cells [see, e.g., Zang et al. (1995) Biotechnology 13:389-392], HEK 293, Ltk⁻, COS-7, DG44, and BHK cells. CHO cells are particularly preferred host cells. Selection of host cell lines for use in artificial chromosome-based protein production systems is within the skill of the art, but often will depend on a variety of factors, including the properties of the heterologous protein to be produced, potential toxicity of the protein in the host cell, any requirements for post-translational modification (e.g., glycosylation, amination, phosphorylation) of the protein, transcription factors available in the cells, the type of promoter element(s) being used to drive expression of the heterologous gene, whether production will be completely intracellular or the heterologous protein will preferably be secreted from the cell, and the types of processing enzymes in the cell.

[0228] The artificial chromosome-based system for heterologous protein production has many advantageous features. For example, as described above, because the heterologous DNA is located in an independent, extra-genomic artificial chromosome (as opposed to randomly inserted in an unknown area of the host cell genome or located as extrachromosomal element(s) providing only transient expression) it is stably maintained in an active transcription unit and is not subject to ejection via recombination or elimination during cell division. Accordingly, it is unnecessary to include a selection gene in the host cells and thus growth under selective conditions is also unnecessary. Furthermore, because the artificial chromosomes are capable of incorporating large segments of DNA, multiple copies of the heterologous gene and linked promoter element(s) can be retained in the chromosomes, thereby providing for high-level expression of the foreign protein(s). Alternatively, multiple copies of the gene can be linked to a single promoter element and several different genes may be linked in a fused polygene complex to a single promoter for expression of, for example, all the key proteins constituting a complete metabolic pathway [see, e.g., Beck von Bodman et al. (1995) Biotechnology 13:587-591]. Alternatively, multiple copies of a single gene can be operatively linked to a single promoter, or each or one or several copies may be linked to different promoters or multiple copies of the same promoter. Additionally, because artificial chromosomes have an almost unlimited capacity for integration and expression of foreign genes, they can be used not only for the expression of genes encoding end-products of interest, but also for the expression of genes associated with optimal maintenance and metabolic management of the host cell, e.g., genes encoding growth factors, as well as genes that may facilitate rapid synthesis of the correct form of the desired heterologous protein product, e.g., genes encoding processing enzymes and transcription factors. The MACs are suitable for expression of any proteins or peptides, including proteins and peptides that require in vivo posttranslational modification for their biological activity. Such proteins include, but are not limited to, antibody fragments, full-length antibodies, and multimeric antibodies, tumor suppressor proteins, naturally occurring or artificial antibodies and enzymes, heat shock proteins, and others.

[0229] Thus, such cell-based “protein factories” employing MACs can be generated using MACs constructed with multiple copies [theoretically an unlimited number or at least up to a number such that the resulting MAC is about up to the size of a genomic chromosome (i.e., endogenous)] of protein-encoding genes with appropriate promoters, or multiple genes driven by a single promoter, i.e., a fused gene complex [such as a complete metabolic pathway in a plant expression system; see, e.g., Beck von Bodman (1995) Biotechnology 13:587-591]. Once such a MAC is constructed, it can be transferred to a suitable cell culture system, such as a CHO cell line in protein-free culture medium [see, e.g., (1995) Biotechnology 13:389-39] or other immortalized cell lines [see, e.g., (1993) TIBTECH 11:232-238] where continuous production can be established.

[0230] The ability of MACs to provide for high-level expression of heterologous proteins in host cells is demonstrated, for example, by analysis of the H1D3 and G3D5 cell lines described herein and deposited with the ECACC. Northern blot analysis of mRNA obtained from these cells reveals that expression of the hygromycin-resistance and β-galactosidase genes in the cells correlates with the amplicon number of the megachromosome(s) contained therein.

[0231] F. Methods for the synthesis of DNA sequences containing repeated DNA units

[0232] Generally, assembly of tandemly repeated DNA poses difficulties such as unambiguous annealing of the complementary oligos. For example, separately annealed products may ligate in an inverted orientation. Additionally, tandem or inverted repeats are particularly susceptible to recombination and deletion events that may disrupt the sequence. Selection of appropriate host organisms (e.g., rec⁻ strains) for use in the cloning steps of the synthesis of sequences of tandemly repeated DNA units may aid in reduction and elimination of such events.

[0233] Methods are provided herein for the synthesis of extended DNA sequences containing repeated DNA units. These methods are particularly applicable to the synthesis of arrays of tandemly repeated DNA units, which are generally difficult or not possible to construct utilizing other known gene assembly strategies. A specific use of these methods is in the synthesis of sequences of any length containing simple (e.g., ranging from 2-6 nucleotides) tandem repeats (such as telomeres and satellite DNA repeats and trinucleotide repeats of possible clinical significance) as well as complex repeated DNA sequences. A particular example of the synthesis of a telomere sequence containing over 150 successive repeated hexamers utilizing these methods is provided herein.

[0234] The methods provided herein for synthesis of arrays of tandem DNA repeats are based on a series of extension steps in which successive doublings of a sequence of repeats result in an exponential expansion of the array of tandem repeats. These methods provide several advantages over previously known methods of gene assembly. For instance, the starting oligonucleotides are used only once. The intermediates in, as well as the final product of, the construction of the DNA arrays described herein may be obtained in cloned form in a microbial organism (e.g., E. coli and yeast). Of particular significance with regard to these methods is the fact that sequence length increases exponentially, as opposed to linearly, in each extension step of the procedure even though only two oligonucleotides are required in the methods. The construction process does not depend on the compatibility of restriction enzyme recognition sequences and the sequence of the repeated DNA because restriction sites are used only temporarily during the assembly procedure. No adaptor is necessary, though a region of similar function is located between two of the restriction sites employed in the process. The only limitation with respect to restriction site use is that the two sites employed in the method must not be present elsewhere in the vector utilized in any cloning steps. These procedures can also be used to construct complex repeats with perfectly identical repeat units, such as the variable number tandem repeat (VNTR) 3′ of the human apolipoprotein B100 gene (a repeat unit of 30 bp, 100% AT) or alphoid satellite DNA.
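The exponential growth described above can be illustrated with a short arithmetic sketch. It is not part of the disclosed procedure: it models only the doubling of the repeat array in each extension step and ignores the restriction, ligation and cloning operations. The function names, the hexamer TTAGGG and the starting array of 10 units (60 nucleotides) are assumptions chosen for illustration.

```python
# Illustrative sketch only: models array length per extension step,
# not the enzymatic or cloning steps of the method.

def extend_once(array: str) -> str:
    """One extension step: the array of repeats is doubled."""
    return array + array


def build_array(unit: str, start_units: int, steps: int) -> str:
    """Start from `start_units` copies of `unit` and apply `steps` doublings."""
    array = unit * start_units
    for _ in range(steps):
        array = extend_once(array)
    return array


def steps_needed(start_units: int, target_units: int) -> int:
    """Smallest number of doublings that yields at least `target_units` repeats."""
    if start_units < 1:
        raise ValueError("start_units must be at least 1")
    steps = 0
    units = start_units
    while units < target_units:
        units *= 2
        steps += 1
    return steps


# Hypothetical example: a hexamer repeat, 10 units in the starting oligonucleotide.
UNIT = "TTAGGG"        # assumed repeat unit, for illustration only
START_UNITS = 10       # 60 nucleotides of repeat in the starting material
doublings = steps_needed(START_UNITS, 150)
telomere_like = build_array(UNIT, START_UNITS, doublings)
```

Under these assumptions, four extension steps take the array from 10 to 160 hexamers (960 nucleotides), which exceeds 150 repeats; adding one 10-unit block per step would instead require many more steps to reach the same length.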

[0235] The method of synthesizing DNA sequences containing tandem repeats may generally be described as follows.

[0236] 1. Starting materials

[0237] Two oligonucleotides are utilized as starting materials. Oligonucleotide 1 is of length k of repeated sequence (the flanks of which are not relevant) and contains a relatively short stretch (60-90 nucleotides) of the repeated sequence, flanked with appropriately chosen restriction sites:

5′-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>S2-3′

[0238] wherein S1 is restriction site 1 cleaved by E1 [preferably an enzyme producing a 3′-overhang (e.g., PacI, PstI, SphI, NsiI, etc.) or blunt-end], S2 is a second restriction site cleaved by E2 (preferably an enzyme producing a 3′-overhang or one that cleaves outside the recognition sequence, such as TspRI), ‘>’ represents a simple repeat unit, and ‘_’ denotes a short (8-10 nucleotide) flanking sequence complementary to oligonucleotide 2:

3′-_______S3-5′

[0239] wherein S3 is a third restriction site, cleaved by enzyme E3, which is present in the vector to be used during the construction.

[0240] Because there is a large variety of restriction enzymes that recognize many different DNA sequences as cleavage sites, it should always be possible to select sites and enzymes (preferably those that yield a 3′-protruding end) suitable for these methods in connection with the synthesis of any one particular repeat array. In most cases, only 1 (or perhaps 2) nucleotide(s) of a restriction site is required to be present in the repeat sequence, and the remaining nucleotides of the restriction site can be removed, for example:

PacI: TTAAT/TAA-- (Klenow/dNTP) TAA--
PstI: CTGCA/G-- (Klenow/dNTP) G--
NsiI: ATGCA/T-- (Klenow/dNTP) T--
KpnI: GGTAC/C-- (Klenow/dNTP) C--

[0241] Though there is no known restriction enzyme leaving a single A behind, this problem can be solved with enzymes leaving behind none at all, for example:

TaiI: ACGT/ (Klenow/dNTP) --
NlaIII: CATG/ (Klenow/dNTP) --

[0242] Additionally, if mung bean nuclease is used instead of Klenow, then the following is obtained:

[0243] XbaI: T/CTAGA (mung bean nuclease) A--

[0244] Furthermore, there are a number of restriction enzymes that cut outside of the recognition sequence, and in this case, there is no limitation at all:

TspRI: NNCAGTGNN/-- (Klenow/dNTP) --
BsmI: GAATGCN/-- (Klenow/dNTP) --
      CTTAC/GN-- (Klenow/dNTP) --

[0245] 2. Step 1—Annealing

[0246] Oligonucleotides 1 and 2 are annealed at a temperature selected depending on the length of overlap (typically in the range of 30-65° C.).
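As a rough illustration of how an annealing temperature might be estimated from the overlap, the following Python sketch applies the Wallace rule (2° C. per A or T and 4° C. per G or C), a common rule of thumb for short oligonucleotides that is not part of the method described herein. The function name and the 10-nucleotide overlap sequence are hypothetical, and in practice the temperature would be selected empirically.

```python
def wallace_tm(overlap: str) -> int:
    """Rule-of-thumb melting temperature (deg C) for a short DNA overlap.

    Assumption: the Wallace rule, Tm = 2*(A+T) + 4*(G+C), is used only as
    an illustrative estimate; it is not taken from the source text.
    """
    seq = overlap.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("overlap must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 10-nt flanking sequence shared by oligonucleotides 1 and 2.
example_overlap = "GATCCGTACG"
print(wallace_tm(example_overlap))
```

For overlaps of 8-10 nucleotides this estimate falls in the low end of the 30-65° C. range mentioned above, which is consistent with choosing the temperature according to overlap length.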

[0247] 3. Step 2—Generating a double-stranded molecule

[0248] The annealed oligonucleotides are filled in with Klenow polymerase in the presence of dNTP to produce a double-stranded (ds) sequence:

5′-S1>>>>>>>>>>>>>>>>>>>>>>S2____S3-3′
3′-S1<<<<<<<<<<<<<<<<<<<<<<S2____S3-5′

[0249] 4. Step 3—Incorporation of double-stranded DNA into a vector

[0250] The double-stranded DNA is cleaved with restriction enzymes E1 and E3 and subsequently ligated into a vector (e.g., pUC19 or a yeast vector) that has been cleaved with the same enzymes E1 and E3. The ligation product is used to transform competent host cells compatible with the vector being used (e.g., when pUC19 is used, bacterial cells such as E. coli DH5α are suitable hosts), which are then plated onto selection plates. Recombinants can be identified either by color (e.g., by X-gal staining for β-galactosidase expression) or by colony hybridization using ³²P-labeled oligonucleotide 2 (detection by hybridization to oligonucleotide 2 is preferred because its sequence is removed in each of the subsequent extension steps and thus is present only in recombinants that contain DNA that has undergone successful extension of the repeated sequence).

[0251] 5. Step 4—Isolation of insert from the plasmid

[0252] An aliquot of the recombinant plasmid containing k nucleotides of the repeat sequence is digested with restriction enzymes E1 and E3, and the insert is isolated on a gel (native polyacrylamide while the insert is short, but agarose can be used for isolation of longer inserts in subsequent steps). A second aliquot of the recombinant plasmid is cut with enzymes E2 (treated with Klenow and dNTP to remove the 3′-overhang) and E3, and the large fragment (plasmid DNA plus the insert) is isolated.

[0253] 6. Step 5—Extension of the DNA sequence of k repeats

[0254] The two DNAs (the S1-S3 insert fragment and the vector plus insert) are ligated, plated onto selective plates, and screened for extended recombinants as in Step 3. Now the length of the repeat sequence between restriction sites is twice that of the repeat sequence in the previous step, i.e., 2×k.

[0255] 7. Step 6—Extension of the DNA sequence of 2×k repeats

[0256] Steps 4 and 5 are repeated as many times as needed to achieve the desired repeat sequence size. In each extension cycle, the repeat sequence size doubles, i.e., if m is the number of extension cycles, the size of the repeat sequence will be k×2^(m) nucleotides.
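The doubling arithmetic can be sketched in Python as follows. This is a minimal illustration, not the laboratory procedure: each extension cycle is modeled abstractly as joining two copies of the current repeat array, the function names are hypothetical, and the hexamer TTAGGG with a starting length of k = 60 nucleotides is assumed purely as an example.

```python
def repeat_length(k: int, m: int) -> int:
    """Length of the repeat array after m extension cycles: k * 2**m."""
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m non-negative")
    return k * 2 ** m


def cycles_needed(k: int, target: int) -> int:
    """Smallest number of cycles m such that k * 2**m >= target."""
    if k <= 0 or target <= 0:
        raise ValueError("k and target must be positive")
    m = 0
    length = k
    while length < target:
        length *= 2
        m += 1
    return m


def extend(array: str) -> str:
    """One extension cycle, modeled as joining two copies of the array."""
    return array + array


# Assumed example: a 60-nt starting array of the hexamer TTAGGG,
# extended until it holds at least 150 hexamers (900 nt).
unit = "TTAGGG"
array = unit * 10
for _ in range(cycles_needed(len(array), 150 * len(unit))):
    array = extend(array)
print(len(array), len(array) // len(unit))
```

Under these assumptions, four cycles suffice to exceed 150 hexamers, which illustrates why exponential growth needs far fewer cloning rounds than adding one fixed-length block per round.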

[0257] The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.

EXAMPLE 1

[0258] General Materials and Methods

[0259] The following materials and methods are exemplary of methods that are used in the following Examples and that can be used to prepare cell lines containing artificial chromosomes. Other suitable materials and methods known to those of skill in the art may be used. Modifications of these materials and methods known to those of skill in the art may also be employed.

[0260] A. Culture of cell lines, cell fusion, and transfection of cells

[0261] 1. Chinese hamster K-20 cells and mouse A9 fibroblast cells were cultured in F-12 medium. EC3/7 [see, U.S. Pat. No. 5,288,625, and deposited at the European Collection of Animal Cell Cultures (ECACC) under accession no. 90051001; see, also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S. application Ser. No. 08/375,271] and EC3/7C5 [see, U.S. Pat. No. 5,288,625 and Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046] mouse cell lines, and the KE1-2/4 hybrid cell line were maintained in F-12 medium containing 400 μg/ml G418 [SIGMA, St. Louis, Mo.].

[0262] 2. TF1004G19 and TF1004G-19C5 mouse cells, described below, and the 19C5xHa4 hybrid, described below, and its sublines were cultured in F-12 medium containing up to 400 μg/ml Hygromycin B [Calbiochem]. LP11 cells were maintained in F-12 medium containing 3-15 μg/ml Puromycin [SIGMA, St. Louis, Mo.].

[0263] 3. Cotransfection of EC3/7C5 cells with plasmids [pH132, pCH110 available from Pharmacia, see, also Hall et al. (1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA was conducted using the calcium phosphate DNA precipitation method [see, e.g., Chen et al. (1987) Mol. Cell. Biol. 7:2745-2752], using 2-5 μg plasmid DNA and 20 μg λ phage DNA per 5×10⁶ recipient cells.

[0264] 4. Cell fusion

[0265] Mouse and hamster cells were fused using polyethylene glycol [Davidson et al. (1976) Som. Cell Genet. 2:165-176]. Hybrid cells were selected in HAT medium containing 400 μg/ml Hygromycin B.

[0266] Approximately 2×10⁷ recipient and 2×10⁶ donor cells were fused using polyethylene glycol [Davidson et al. (1976) Som. Cell Genet. 2:165-176]. Hybrids were selected and maintained in F-12/HAT medium [Szybalsky et al. (1962) Natl. Cancer Inst. Monogr. 7:75-89] containing 10% FCS and 400 μg/ml G418. The presence of “parental” chromosomes in the hybrid cell lines was verified by in situ hybridization with species-specific probes using biotin-labeled human and hamster genomic DNA, and a mouse long interspersed repetitive DNA [pMCPE1.51].

[0267] 5. Microcell fusion

[0268] Microcell-mediated transfer of artificial chromosomes from EC3/7C5 cells to recipient cells was done according to Saxon et al. [(1985) Mol. Cell. Biol. 1:140-146] with the modifications of Goodfellow et al. [(1989) Techniques for mammalian genome transfer. In Genome Analysis a Practical Approach. K. E. Davies, ed., IRL Press, Oxford, Washington D.C. pp. 1-17] and Yamada et al. [(1990) Oncogene 5:1141-1147]. Briefly, 5×10⁶ EC3/7C5 cells in a T25 flask were treated first with 0.05 μg/ml colcemid for 48 hr and then with 10 μg/ml cytochalasin B for 30 min. The T25 flasks were centrifuged on edge and the pelleted microcells were suspended in serum-free DME medium. The microcells were filtered through first a 5 micron and then a 3 micron polycarbonate filter, treated with 50 μg/ml of phytohemagglutinin, and used for polyethylene glycol-mediated fusion with recipient cells. Selection of cells containing the MMCneo was started 48 hours after fusion in medium containing 400-800 μg/ml G418.

[0269] Microcells were also prepared from 1B3 and GHB42 donor cells as follows in order to be fused with E2D6K cells [a CHO K-20 cell line carrying the puromycin N-acetyltransferase gene, i.e., the puromycin resistance gene, under the control of the SV40 early promoter]. The donor cells were seeded to achieve 60-75% confluency within 24-36 hours. After that time, the cells were arrested in mitosis by exposure to colchicine (10 μg/ml) for 12 or 24 hours to induce micronucleation. To promote micronucleation of GHB42 cells, the cells were exposed to hypotonic treatment (10 min at 37° C.). After colchicine treatment, or after colchicine and hypotonic treatment, the cells were grown in colchicine-free medium.

[0270] The donor cells were trypsinized and centrifuged and the pellets were suspended in a 1:1 Percoll medium and incubated for 30-40 min at 37° C. After the incubation, 1-3×10⁷ cells (60-70% micronucleation index) were loaded onto each Percoll gradient (each fusion was distributed on 1-2 gradients). The gradients were centrifuged at 19,000 rpm for 80 min in a Sorvall SS-34 rotor at 34-37° C. After centrifugation, two visible bands of cells were removed, centrifuged at 2000 rpm for 10 min at 4° C., resuspended, and filtered through 8 μm pore size nucleopore filters.

[0271] The microcells prepared from the 1B3 and GHB42 cells were fused with E2D6K. The E2D6K cells were generated by CaPO₄ transfection of CHO K-20 cells with pCHTV2. Plasmid pCHTV2 contains the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal, the Saccharomyces cerevisiae URA3 gene, 2.4- and 3.2-kb fragments of a Chinese hamster chromosome 2-specific satellite DNA (HC-2 satellite; see Fatyol et al. (1994) Nuc. Acids Res. 22:3728-3736), two copies of the diphtheria toxin-A chain gene (one linked to the herpes simplex virus thymidine kinase (HSV-TK) gene promoter and SV40 polyadenylation signal and the other linked to the HSV-TK promoter without a polyadenylation signal), the ampicillin-resistance gene and the ColE1 origin of replication. Following transfection, puromycin-resistant colonies were isolated. The presence of the pCHTV2 plasmid in the E2D6K cell line was confirmed by nucleic acid amplification of DNA isolated from the cells.

[0272] The purified microcells were centrifuged as described above and resuspended in 2 ml of phytohemagglutinin-P (PHA-P, 100 μg/ml). The microcell suspension was then added to a 60-70% confluent recipient culture of E2D6K cells. The preparation was incubated at room temperature for 30-40 min to agglutinate the microcells. After the PHA-P was removed, the cells were incubated with 1 ml of 50% polyethylene glycol (PEG) for one min. The PEG was removed and the culture was washed three times with F-12 medium without serum. The cells were incubated in non-selective medium for 48-60 hours. After this time, the cell culture was trypsinized and plated in F-12 medium containing 400 μg/ml hygromycin B and 10 μg/ml puromycin to select against the parental cell lines.

[0273] Hybrid clones were isolated from the cells that had been cultured in selective medium. These clones were then analyzed for expression of β-galactosidase by the X-gal staining method. Four of five hybrid clones analyzed that had been generated by fusion of GHB42 microcells with E2D6K cells yielded positive staining results, indicating expression of β-galactosidase from the lacZ gene contained in the megachromosome contributed by the GHB42 cells. Similarly, a hybrid clone that had been generated by fusion of 1B3 microcells with E2D6K cells yielded positive staining results, indicating expression of β-galactosidase from the lacZ gene contained in the megachromosome contributed by the 1B3 cells. In situ hybridization analysis of the hybrid clones is also performed to analyze the mouse chromosome content of the mouse-hamster hybrid cells.

[0274] B. Chromosome banding

[0275] Trypsin G-banding of chromosomes was performed using the method of Wang & Fedoroff [(1972) Nature 235:52-54], and the detection of constitutive heterochromatin with the BSG C-banding method was done according to Sumner [(1972) Exp. Cell Res. 75:304-306]. For the detection of chromosome replication by bromodeoxyuridine [BrdU] incorporation, the Fluorescence Plus Giemsa [FPG] staining method of Perry & Wolff [(1974) Nature 251:156-158] was used.

[0276] C. Immunolabelling of chromosomes and in situ hybridization

[0277] Indirect immunofluorescence labelling with human anti-centromere serum LU851 [Hadlaczky et al. (1986) Exp. Cell Res. 167:1-15], and indirect immunofluorescence and in situ hybridization on the same preparation were performed as described previously [see, Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see, also U.S. application Ser. No. 08/375,271]. Immunolabelling with fluorescein-conjugated anti-BrdU monoclonal antibody [Boehringer] was performed according to the procedure recommended by the manufacturer, except that for treatment of mouse A9 chromosomes, 2 M hydrochloric acid was used at 37° C. for 25 min, and for chromosomes of hybrid cells, 1 M hydrochloric acid was used at 37° C. for 30 min.

[0278] D. Scanning electron microscopy

[0279] Preparation of mitotic chromosomes for scanning electron microscopy using osmium impregnation was performed as described previously [Sumner (1991) Chromosoma 100:410-418]. The chromosomes were observed with a Hitachi S-800 field emission scanning electron microscope operated with an accelerating voltage of 25 kV.

[0280] E. DNA manipulations, plasmids and probes

[0281] 1. General methods

[0282] All general DNA manipulations were performed by standard procedures [see, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.]. The mouse major satellite probe was provided by Dr. J. B. Rattner [University of Calgary, Alberta, Canada]. Cloned mouse satellite DNA probes [see Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], including the mouse major satellite probe, were gifts from Dr. J. B. Rattner, University of Calgary. Hamster chromosome painting was done with total hamster genomic DNA, and a cloned repetitive sequence specific to the centromeric region of chromosome 2 [Fatyol et al. (1994) Nucl. Acids Res. 22:3728-3736] was also used. Mouse chromosome painting was done with a cloned long interspersed repetitive sequence [pMCPE1.51] specific for the mouse euchromatin.

[0283] For cotransfection and for in situ hybridization, the pCH110 β-galactosidase construct [Pharmacia or Invitrogen] and λ cI857 Sam7 phage DNA [New England Biolabs] were used.

[0284] 2. Construction of Plasmid pPuroTel

[0285] Plasmid pPuroTel, which carries a Puromycin-resistance gene and a cloned 2.5 kb human telomeric sequence [see SEQ ID No. 3], was constructed from the pBabe-puro retroviral vector [Morgenstern et al. (1990) Nucl. Acids Res. 18:3587-3596; provided by Dr. L. Székely (Microbiology and Tumorbiology Center, Karolinska Institutet, Stockholm); see, also Tonghua et al. (1995) Chin. Med. J. (Beijing, Engl. Ed.) 108:653-659; Couto et al. (1994) Infect. Immun. 62:2375-2378; Dunckley et al. (1992) FEBS Lett. 296:128-34; French et al. (1995) Anal. Biochem. 228:354-355; Liu et al. (1995) Blood 85:1095-1103; International PCT application Nos. WO 9520044; WO 9500178, and WO 9419456].

[0286] F. Deposited cell lines

[0287] Cell lines KE1-2/4, EC3/7C5, TF1004G19C5, 19C5xHa4, G3D5 and H1D3 have been deposited in accord with the Budapest Treaty at the European Collection of Animal Cell Cultures (ECACC) under Accession Nos. 96040924, 96040925, 96040926, 96040927, 96040928 and 96040929, respectively. The cell lines were deposited on Apr. 9, 1996, at the European Collection of Animal Cell Cultures (ECACC), Vaccine Research and Production Laboratory, Public Health Laboratory Service, Centre for Applied Microbiology and Research, Porton Down, Salisbury, Wiltshire SP4 0JG, United Kingdom. The deposits were made in the name of Gyula Hadlaczky of H. 6723, SZEGED, SZAMOS U.1.A. IX. 36. HUNGARY, who has authorized reference to the deposited cell lines in this application.

EXAMPLE 2

[0288] Preparation of EC3/7, EC3/7C5 and Related Cell Lines

[0289] The EC3/7 cell line is an LMTK⁻ mouse cell line that contains the neo-centromere. The EC3/7C5 cell line is a single-cell subclone of EC3/7 that contains the neo-minichromosome.

[0290] A. EC3/7 Cell line

[0291] As described in U.S. Pat. No. 5,288,625 [see, also Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046 and Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110], de novo centromere formation occurs in a transformed mouse LMTK⁻ fibroblast cell line [EC3/7] after cointegration of λ constructs [λCM8 and λgtWESneo] carrying human and bacterial DNA.

[0292] By cotransfection of a 14 kb human DNA fragment cloned in λ [λCM8] and a dominant marker gene [λgtWESneo], a selectable centromere linked to a dominant marker gene [neo-centromere] was formed in mouse LMTK⁻ cell line EC3/7 [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see FIG. 1]. Integration of the heterologous DNA [the λ DNA and marker gene-encoding DNA] occurred into the short arm of an acrocentric chromosome [chromosome 7 (see, FIG. 1B)], where an amplification process resulted in the formation of the new centromere [neo-centromere (see FIG. 1C)]. On the dicentric chromosome (FIG. 1C), the newly formed centromere region contains all the heterologous DNA (human, λ, and bacterial) introduced into the cell and an active centromere.

[0293] Having two functionally active centromeres on the same chromosome causes regular breakages between the centromeres [see, FIG. 1E]. The distance between the two centromeres on the dicentric chromosome is estimated to be ˜10-15 Mb, and the breakage that separates the minichromosome occurred between the two centromeres. Such specific chromosome breakages result in the appearance [in approximately 10% of the cells] of a chromosome fragment that carries the neo-centromere [FIG. 1F]. This chromosome fragment is principally composed of human, λ, plasmid, and neomycin-resistance gene DNA, but it also has some mouse chromosomal DNA. Cytological evidence suggests that during the stabilization of the MMCneo, there was an inverted duplication of the chromosome fragment bearing the neo-centromere. The size of minichromosomes in cell lines containing the MMCneo is approximately 20-30 Mb; this finding indicates a two-fold increase in size.

[0294] From the EC3/7 cell line, which contains the dicentric chromosome [FIG. 1E], two sublines [EC3/7C5 and EC3/7C6] were selected by repeated single-cell cloning. In these cell lines, the neo-centromere was found exclusively on a small chromosome [neo-minichromosome], while the formerly dicentric chromosome carried detectable amounts of the exogenously-derived DNA sequences but not an active neo-centromere [FIGS. 1F and 1G].

[0295] The minichromosomes of cell lines EC3/7C5 and EC3/7C6 are similar. No differences are detected in their architectures at either the cytological or molecular level. The minichromosomes were indistinguishable by conventional restriction endonuclease mapping or by long-range mapping using pulsed field electrophoresis and Southern hybridization. The cytoskeleton of cells of the EC3/7C6 line showed an increased sensitivity to colchicine, so the EC3/7C5 line was used for further detailed analysis.

[0296] B. Preparation of the EC3/7C5 and EC3/7C6 cell lines

[0297] The EC3/7C5 cells, which contain the neo-minichromosome, were produced by subcloning the EC3/7 cell line in high concentrations of G418 [40-fold the lethal dose] for 350 generations. Two single cell-derived stable cell lines [EC3/7C5 and EC3/7C6] were established. These cell lines carry the neo-centromere on minichromosomes and also contain the remaining fragment of the dicentric chromosome. Indirect immunofluorescence with anti-centromere antibodies and subsequent in situ hybridization experiments demonstrated that the minichromosomes derived from the dicentric chromosome. In EC3/7C5 and EC3/7C6 cell lines (140 and 128 metaphases, respectively) no intact dicentric chromosomes were found, and minichromosomes were detected in 97.2% and 98.1% of the cells, respectively. The minichromosomes have been maintained for over 150 cell generations. The cell lines do contain the remaining portion of the formerly dicentric chromosome.

[0298] Multiple copies of telomeric DNA sequences were detected in the marker centromeric region of the remaining portion of the formerly dicentric chromosome by in situ hybridization. This indicates that mouse telomeric sequences were coamplified with the foreign DNA sequences. These stable minichromosome-carrying cell lines provide direct evidence that the extra centromere is functioning and is capable of maintaining the minichromosomes [see, U.S. Pat. No. 5,288,625].

[0299] The chromosome breakage in the EC3/7 cells, which separates the neo-centromere from the mouse chromosome, occurred in the G-band positive “foreign” DNA region. This is supported by the observation of traces of λ and human DNA sequences at the broken end of the formerly dicentric chromosome. Comparing the G-band pattern of the chromosome fragment carrying the neo-centromere with that of the stable neo-minichromosome reveals that the neo-minichromosome is an inverted duplicate of the chromosome fragment that bears the neo-centromere. This is also evidenced by the observation that although the neo-minichromosome carries only one functional centromere, both ends of the minichromosome are heterochromatic, and mouse satellite DNA sequences were found in these heterochromatic regions by in situ hybridization.

[0300] These two cell lines, EC3/7C5 and EC3/7C6, thus carry a selectable mammalian minichromosome [MMCneo] with a centromere linked to a dominant marker gene [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110]. MMCneo is intended to be used as a vector for minichromosome-mediated gene transfer and has been used as a model of a minichromosome-based vector system.

[0301] Long range mapping studies of the MMCneo indicated that human DNA and the neomycin-resistance gene constructs integrated into the mouse chromosome separately, followed by the amplification of the chromosome region that contains the exogenous DNA. The MMCneo contains about 30-50 copies of the λCM8 and λgtWESneo DNA in the form of approximately 160 kb repeated blocks, which together cover at least a 3.5 Mb region. In addition to these, there are mouse telomeric sequences [Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046] and any DNA of mouse origin necessary for the correct higher-ordered structural organization of chromatids.

[0302] Using a chromosome painting probe mCPE1.51 [mouse long interspersed repeated DNA], which recognizes exclusively euchromatic mouse DNA, detectable amounts of interspersed repeat sequences were found on the MMCneo by in situ hybridization. The neo-centromere is associated with a small but detectable amount of satellite DNA. The chromosome breakage that separates the neo-centromere from the mouse chromosome occurs in the “foreign” DNA region. This is demonstrated by the presence of λ and human DNA at the broken end of the formerly dicentric chromosome. At both ends of the MMCneo, however, there are traces of mouse major satellite DNA as evidenced by in situ hybridization. This observation suggests that the doubling in size of the chromosome fragment carrying the neo-centromere during the stabilization of the MMCneo is a result of an inverted duplication. Although mouse telomere sequences, which coamplified with the exogenous DNA sequences during the neo-centromere formation, may provide sufficient telomeres for the MMCneo, the duplication could have supplied the functional telomeres for the minichromosome.

[0303] The nucleotide sequence of portions of the neo-minichromosomes was determined as follows. Total DNA was isolated from EC3/7C5 cells according to standard procedures. The DNA was subjected to nucleic acid amplification using the Expand Long Template PCR system [Boehringer Mannheim] according to the manufacturer's procedures. The amplification procedure required only a single 33-mer oligonucleotide primer corresponding to sequence in a region of the phage λ right arm, which is contained in the neo-minichromosome. The sequence of this oligonucleotide is set forth as the first 33 nucleotides of SEQ ID No. 13. Because the neo-minichromosome contains a series of inverted repeats of this sequence, the single oligonucleotide was used as a forward and reverse primer, resulting in amplification of DNA positioned between sets of inverted repeats of the phage λ DNA. Three products were obtained from the single amplification reaction, which suggests that the sequence of the DNA located between different sets of inverted repeats may differ. In a repeating nucleic acid unit within an artificial chromosome, minor differences may be present and may occur during culturing of cells containing the artificial chromosome. For example, base pair changes may occur as well as integration of mobile genetic elements and deletions of repeated sequences.
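The logic of amplifying with a single primer between inverted repeats can be illustrated with a short Python sketch. It assumes exact primer matches on a known template string, which is a simplification of real priming; the function names, the toy primer and template, and the 20,000-nucleotide size limit are hypothetical and are not taken from SEQ ID No. 13 or from the amplification kit.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every (possibly overlapping) match."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def single_primer_amplicons(template: str, primer: str, max_len: int = 20000):
    """List (start, end) spans on the top strand that one primer could amplify.

    A span qualifies when the primer sequence occurs at its 5' end and the
    primer's reverse complement occurs at its 3' end, i.e. the primer site
    is present as an inverted repeat. max_len is an assumed size limit.
    """
    template = template.upper()
    primer = primer.upper()
    rc = reverse_complement(primer)
    spans = []
    for fwd in find_all(template, primer):
        for rev in find_all(template, rc):
            end = rev + len(rc)
            if rev >= fwd + len(primer) and end - fwd <= max_len:
                spans.append((fwd, end))
    return spans


# Toy example: a hypothetical 9-nt primer site present as an inverted repeat.
toy_primer = "GATTACAGG"
toy_template = "TT" + toy_primer + "AAAAA" + reverse_complement(toy_primer) + "TT"
print(single_primer_amplicons(toy_template, toy_primer))
```

If the template contained several inverted-repeat pairs with different intervening DNA, the function would return several spans, which mirrors the observation of more than one product from a single reaction.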

[0304] Each of the three products was subjected to DNA sequence analysis. The sequences of the three products are set forth in SEQ ID Nos. 13, 14, and 15, respectively. To be certain that the sequenced products were amplified from the neo-minichromosome, control amplifications were conducted using the same primers on DNA isolated from negative control cell lines (mouse Ltk- cells) lacking minichromosomes and the formerly dicentric chromosome, and positive control cell lines [the mouse-hamster hybrid cell line GB43 generated by treating 19C5xHa4 cells (see FIG. 4) with BrdU followed by growth in G418-containing selective medium and retreatment with BrdU] containing the neo-minichromosome only. Only the positive control cell line yielded the three amplification products; no amplification product was detected in the negative control reaction. The results obtained in the positive control amplification also demonstrate that the neo-minichromosome DNA, and not the fragment of the formerly dicentric mouse chromosome, was amplified.

[0305] The sequences of the three amplification products were compared to those contained in the Genbank/EMBL database. SEQ ID Nos. 13 and 14 showed high (˜96%) homology to portions of DNA from intracisternal A-particles from mouse. SEQ ID No. 15 showed no significant homology with sequences available in the database. All three of these sequences may be used for generating gene targeting vectors as homologous DNAs to the neo-minichromosome.

[0306] C. Isolation and partial purification of minichromosomes

[0307] Mitotic chromosomes of EC3/7C5 cells were isolated as described by Hadlaczky et al. [(1981) Chromosoma 81:537-555], using a glycine-hexylene glycol buffer system [Hadlaczky et al. (1982) Chromosoma 86:643-659]. Chromosome suspensions were centrifuged at 1,200×g for 30 minutes. The supernatant containing minichromosomes was centrifuged at 5,000×g for 30 minutes and the pellet was resuspended in the appropriate buffer. Partially purified minichromosomes were stored in 50% glycerol at −20° C.

[0308] D. Stability of the MMCneo maintenance and neo expression

[0309] EC3/7C5 cells grown in non-selective medium for 284 days and then transferred to selective medium containing 400 μg/ml G418 showed a 96% plating efficiency (colony formation) compared to control cells cultured permanently in the presence of G418. Cytogenetic analysis indicated that the MMCneo is stably maintained at one copy per cell under selective and non-selective culture conditions. Only two metaphases with two MMCneo were found in 2,270 metaphases analyzed.

[0310] Southern hybridization analysis showed no detectable changes in DNA restriction patterns, and similar hybridization intensities were observed with a neo probe when DNA from cells grown under selective or non-selective culture conditions was compared.

[0311] Northern analysis of RNA transcripts from the neo gene isolated from cells grown under selective and non-selective conditions showed only minor and not significant differences. Expression of the neo gene persisted in EC3/7C5 cells maintained in F-12 medium free of G418 for 290 days under non-selective culture conditions. The long-term expression of the neo gene(s) from the minichromosome may be influenced by the nuclear location of the MMCneo. In situ hybridization experiments revealed a preferential peripheral location of the MMCneo in the interphase nucleus. In more than 60% of the 2,500 nuclei analyzed, the minichromosome was observed at the perimeter of the nucleus near the nuclear envelope.

EXAMPLE 3

[0312] Minichromosome Transfer and Production of the λneo-chromosome

[0313] A. Minichromosome transfer

[0314] The neo-minichromosome [referred to as MMCneo, FIG. 2C] has been used for gene transfer by fusion of minichromosome-containing cells [EC3/7C5 or EC3/7C6] with different mammalian cells, including hamster and human. Thirty-seven stable hybrid cell lines have been produced. All established hybrid cell lines proved to be true hybrids as evidenced by in situ hybridization using biotinylated human and hamster genomic, or pMCPE1.51 mouse long interspersed repeated, DNA probes for “chromosome painting”. The MMCneo has also been successfully transferred into mouse A9, L929 and pluripotent F9 teratocarcinoma cells by fusion of microcells derived from EC3/7C5 cells. Transfer was confirmed by PCR, Southern blotting and in situ hybridization with minichromosome-specific probes. The cytogenetic analysis confirmed that, as expected for microcell fusion, a few cells [1-5%] received [or retained] the MMCneo.

[0315] These results demonstrate that the MMCneo is tolerated by a wide range of cells. The prokaryotic genes and the extra dosage of the human and λ sequences carried on the minichromosome do not appear to be disadvantageous for tissue culture cells.

[0316] The MMCneo is the smallest chromosome of the EC3/7C5 genome and is estimated to be approximately 20-30 Mb, which is significantly smaller than the majority of the host cell (mouse) chromosomes. By virtue of the smaller size, minichromosomes can be partially purified from a suspension of isolated chromosomes by a simple differential centrifugation. In this way, minichromosome suspensions of 15-20% purity have been prepared. These enriched minichromosome preparations can be used to introduce, such as by microinjection or lipofection, the minichromosome into selected target cells. Target cells include therapeutic cells that can be used in methods of gene therapy, and also embryonic cells for the preparation of transgenic (non-human) animals.

[0317] The MMCneo is capable of autonomous replication, is stably maintained in cells, and permits persistent expression of the neo gene(s), even after long-term culturing under non-selective conditions. It is a non-integrative vector that appears to occupy a territory near the nuclear envelope. Its peripheral localization in the nucleus may have an important role in maintaining the functional integrity and stability of the MMCneo. Functional compartmentalization of the host nucleus may have an effect on the function of foreign sequences. In addition, MMCneo contains megabases of λ DNA sequences that should serve as a target site for homologous recombination and thus integration of desired gene(s) into the MMCneo. It can be transferred by cell and microcell fusion, microinjection, electroporation, lipid-mediated carrier systems or chromosome uptake. The neo-centromere of the MMCneo is capable of maintaining and supporting the normal segregation of a larger 150-200 Mb λneo-chromosome. This result demonstrates that the MMCneo chromosome should be useful for carrying large fragments of heterologous DNA.

[0318] B. Production of the λneo-chromosome

[0319] In the hybrid cell line KE1-2/4 made by fusion of EC3/7 and Chinese hamster ovary cells [FIG. 2], the separation of the neo-centromere from the dicentric chromosome was associated with a further amplification process. This amplification resulted in the formation of a stable chromosome of average size [i.e., the λneo-chromosome; see, Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046]. The λneo-chromosome carries a terminally located functional centromere and is composed of seven large amplicons containing multiple copies of λ, human, bacterial, and mouse DNA sequences [see FIG. 2]. The amplicons are separated by mouse major satellite DNA [Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046] which forms narrow bands of constitutive heterochromatin between the amplicons.

EXAMPLE 4

[0320] Formation of the “Sausage Chromosome” [SC]

[0321] The findings set forth in the above EXAMPLES demonstrate that the centromeric region of the mouse chromosome 7 has the capacity for large-scale amplification [other results indicate that this capacity is not unique to chromosome 7]. This conclusion is further supported by results from cotransfection experiments, in which a second dominant selectable marker gene and a non-selected marker gene were introduced into EC3/7C5 cells carrying the formerly dicentric chromosome 7 and the neo-minichromosome. The EC3/7C5 cell line was transformed with λ phage DNA, a hygromycin-resistance gene construct [pH132], and a β-galactosidase gene construct [pCH110]. Stable transformants were selected in the presence of a high concentration [400 μg/ml] of Hygromycin B, and analyzed by Southern hybridization. Established transformant cell lines showing multiple copies of integrated exogenous DNA were studied by in situ hybridization to localize the integration site(s), and by LacZ staining to detect β-galactosidase expression.

[0322] A. Materials and methods

[0323] 1. Construction of pH132

[0324] The pH132 plasmid carries the hygromycin B resistance gene and the anti-HIV-1 gag ribozyme [see, SEQ ID NO. 6 for DNA sequence that corresponds to the sequence of the ribozyme] under control of the β-actin promoter. This plasmid was constructed from pHyg plasmid [Sugden et al. (1985) Mol. Cell. Biol. 5:410-413; a gift from Dr. A. D. Riggs, Beckman Research Institute, Duarte; see, also, e.g., U.S. Pat. No. 4,997,764], and from pPC-RAG12 plasmid [see, Chang et al. (1990) Clin. Biotech. 2:23-31; provided by Dr. J. J. Rossi, Beckman Research Institute, Duarte; see, also U.S. Pat. Nos. 5,272,262, 5,149,796 and 5,144,019, which describe the anti-HIV gag ribozyme and construction of a mammalian expression vector containing the ribozyme insert linked to the β-actin promoter and SV40 late gene transcriptional termination and polyA signals]. Construction of pPC-RAG12 involved insertion of the ribozyme insert flanked by BamHI linkers into BamHI-digested pHβ-Apr-1gpt [see, Gunning et al. (1987) Proc. Natl. Acad. Sci. U.S.A. 84:4831-4835, see, also U.S. Pat. No. 5,144,019].

[0325] Plasmid pH132 was constructed as follows. First, pPC-RAG12 [described by Chang et al. (1990) Clin. Biotech. 2:23-31] was digested with BamHI to excise a fragment containing an anti-HIV ribozyme gene [referred to as ribozyme D by Chang et al. [(1990) Clin. Biotech. 2:23-31]; see also U.S. Pat. No. 5,144,019 to Rossi et al., particularly FIG. 4 of the patent] flanked by the human β-actin promoter at the 5′ end of the gene and the SV40 late transcriptional termination and polyadenylation signals at the 3′ end of the gene. As described by Chang et al. [(1990) Clin. Biotech. 2:23-31], ribozyme D is targeted for cleavage of the translational initiation region of the HIV gag gene. This fragment of pPC-RAG12 was subcloned into pBluescript-KS(+) [Stratagene, La Jolla, Calif.] to produce plasmid 132. Plasmid 132 was then digested with XhoI and EcoRI to yield a fragment containing the ribozyme D gene flanked by the β-actin promoter at the 5′ end and the SV40 termination and polyadenylation signals at the 3′ end of the gene. This fragment was ligated to the largest fragment generated by digestion of pHyg [Sugden et al. (1985) Mol. Cell. Biol. 5:410-413] with EcoRI and SalI to yield pH132. Thus, pH132 is an ˜9.3 kb plasmid containing the following elements: the β-actin promoter linked to an anti-HIV ribozyme gene followed by the SV40 termination and polyadenylation signals, the thymidine kinase gene promoter linked to the hygromycin-resistance gene followed by the thymidine kinase gene polyadenylation signal, and the E. coli ColE1 origin of replication and the ampicillin-resistance gene.

[0326] The plasmid pHyg [see, e.g., U.S. Pat. Nos. 4,997,764, 4,686,186 and 5,162,215], which confers resistance to hygromycin B using transcriptional controls from the HSV-1 tk gene, was originally constructed from pKan2 [Yates et al. (1984) Proc. Natl. Acad. Sci. U.S.A. 81:3806-3810] and pLG89 [see, Gritz et al. (1983) Gene 25:179-188]. Briefly, pKan2 was digested with SmaI and BglII to remove the sequences derived from transposon Tn5. The hygromycin-resistance hph gene was inserted into the digested pKan2 using blunt-end ligation at the SmaI site and “sticky-end” ligation [using 1 Weiss unit of T4 DNA ligase (BRL) in a 20 microliter volume] at the BglII site. The SmaI and BglII sites of pKan2 were lost during ligation.

[0327] The resulting plasmid pH132, produced from introduction of the anti-HIV ribozyme construct with promoter and polyA site into pHyg, includes the anti-HIV ribozyme under control of the β-actin promoter as well as the hygromycin-resistance gene under control of the TK promoter.

[0328] 2. Chromosome banding

[0329] Trypsin G-banding of chromosomes was performed as described in EXAMPLE 1.

[0330] 3. Cell cultures

[0331] TF1004G19 and TF1004G-19C5 mouse cells and the 19C5xHa4 hybrid, described below, and its sublines were cultured in F-12 medium containing 400 μg/ml Hygromycin B [Calbiochem].

[0332] B. Cotransfection of EC3/7C5 to produce TF1004G19

[0333] Cotransfection of EC3/7C5 cells with plasmids [pH132 and pCH110, the latter available from Pharmacia; see, also Hall et al. (1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA [λ cI857 Sam7 (New England Biolabs)] was conducted using the calcium phosphate DNA precipitation method [see, e.g., Chen et al. (1987) Mol. Cell. Biol. 7:2745-2752], using 2-5 μg plasmid DNA and 20 μg λ phage DNA per 5×10⁶ recipient cells.

[0334] C. Cell lines containing the sausage chromosome

[0335] Analysis of one of the transformants, designated TF1004G19, revealed that it has a high copy number of integrated pH132 and pCH110 sequences, and a high level of β-galactosidase expression. G-banding and in situ hybridization with a human probe [CM8; see, e.g., U.S. application Ser. No. 08/375,271] revealed unexpectedly that integration had occurred in the formerly dicentric chromosome 7 of the EC3/7C5 cell line. Furthermore, this chromosome carried a newly formed heterochromatic chromosome arm. The size of this heterochromatic arm varied between ˜150 and ˜800 Mb in individual metaphases.

[0336] By single cell cloning from the TF1004G19 cell line, a subclone TF1004G-19C5 [FIG. 2D], which carries a stable chromosome 7 with a ˜100-150 Mb heterochromatic arm [the sausage chromosome], was obtained. This cell line has been deposited in the ECACC under Accession No. 96040926. This chromosome arm is composed of four to five satellite segments rich in satellite DNA, and evenly spaced integrated heterologous “foreign” DNA sequences. At the end of the compact heterochromatic arm of the sausage chromosome, a less condensed euchromatic terminal segment is regularly observed. This subclone was used for further analyses.

[0337] D. Demonstration that the sausage chromosome is derived from the formerly dicentric chromosome

[0338] In situ hybridization with λ phage and pH132 DNA on the TF1004G-19C5 cell line showed positive hybridization only on the minichromosome and on the heterochromatic arm of the “sausage” chromosome [FIG. 2D]. It appears that the “sausage” chromosome [herein also referred to as the SC] developed from the formerly dicentric chromosome (FD) of the EC3/7C5 cell line.

[0339] To establish this, the integration sites of pCH110 and pH132 plasmids were determined. This was accomplished by in situ hybridization on these cells with biotin-labeled subfragments of the hygromycin-resistance gene and the β-galactosidase gene. Both experiments resulted in narrow hybridizing bands on the heterochromatic arm of the sausage chromosome. The same hybridization pattern was detected on the sausage chromosome using a mixture of biotin-labeled λ probe and pH132 plasmid, proving the cointegration of λ phages, pH132 and pCH110 plasmids.

[0340] To examine this further, the cells were cultured in the presence of the DNA-binding dye Hoechst 33258. Culturing of mouse cells in the presence of this dye results in under-condensation of the pericentric heterochromatin of metaphase chromosomes, thereby permitting better observation of the hybridization pattern. Using this technique, the heterochromatic arm of the sausage chromosome of TF1004G-19C5 cells showed regular under-condensation, revealing the details of the structure of the “sausage” chromosome by in situ hybridization. Results of in situ hybridization on Hoechst-treated TF1004G-19C5 cells with biotin-labeled subfragments of hygromycin-resistance and β-galactosidase genes show that these genes are localized only in the heterochromatic arm of the sausage chromosome. In addition, an equal banding hybridization pattern was observed. This pattern of repeating units [amplicons] clearly indicates that the sausage chromosome was formed by an amplification process and that the λ phage, pH132 and pCH110 plasmid DNA sequences border the amplicons.

[0341] In another series of experiments using fluorescence in situ hybridization [FISH] carried out with mouse major satellite DNA, the main component of the mouse pericentric heterochromatin, the results confirmed that the amplicons of the sausage chromosome are primarily composed of satellite DNA.

[0342] E. The sausage chromosome has one centromere

[0343] To determine whether mouse centromeric sequences had participated in the amplification process forming the “sausage” chromosome and whether or not the amplicons carry inactive centromeres, in situ hybridization was carried out with mouse minor satellite DNA. Mouse minor satellite DNA is localized specifically near the centromeres of all mouse chromosomes. Positive hybridization was detected in all mouse centromeres including the sausage chromosome, which, however, only showed a positive signal at the beginning of the heterochromatic arm.

[0344] Indirect immunofluorescence with a human anti-centromere antibody [LU 851] which recognizes only functional centromeres [see, e.g., Hadlaczky et al. (1989) Chromosoma 97:282-288] proved that the sausage chromosome has only one active centromere. The centromere comes from the formerly dicentric part of the chromosome and co-localizes with the in situ hybridization signal of the mouse minor satellite DNA probe.

[0345] F. The selected and non-selected heterologous DNA in the heterochromatin of the sausage chromosome is expressed

[0346] 1. High levels of the heterologous genes are expressed

[0347] The TF1004G-19C5 cell line thus carries multiple copies of hygromycin-resistance and β-galactosidase genes localized only in the heterochromatic arm of the sausage chromosome. The TF1004G-19C5 cells can grow very well in the presence of 200 μg/ml or even 400 μg/ml hygromycin B. [The level of expression was determined by Northern hybridization with a subfragment of the hygromycin-resistance gene and a single-copy gene.]

[0348] The expression of the non-selected β-galactosidase gene in the TF1004G-19C5 transformant was detected with LacZ staining of the cells. By this method one hundred percent of the cells stained dark blue, showing that there is a high level of β-galactosidase expression in all of the TF1004G-19C5 cells.

[0349] 2. The heterologous genes that are expressed are in the heterochromatin of the sausage chromosome

[0350] To demonstrate that the genes localized in the constitutive heterochromatin of the sausage chromosome provide the hygromycin resistance and the LacZ staining capability of TF1004G-19C5 transformants [i.e., β-gal expression], PEG-induced cell fusion between TF1004G-19C5 mouse cells and Chinese hamster ovary cells was performed. The hybrids were selected and maintained in HAT medium containing G418 [400 μg/ml] and hygromycin [200 μg/ml]. Two hybrid clones designated 19C5xHa3 and 19C5xHa4, which have been deposited in the ECACC under Accession No. 96040927, were selected. Both carry the sausage chromosome and the minichromosome.

[0351] Twenty-seven single cell derived colonies of the 19C5xHa4 hybrid were maintained and analyzed as individual subclones. In situ hybridization with hamster and mouse chromosome painting probes and hamster chromosome 2-specific probes verified that the 19C5xHa4 clone contains the complete Chinese hamster genome and a partial mouse genome. All 19C5xHa4 subclones retained the hamster genome, but different subclones showed different numbers of mouse chromosomes, indicating the preferential elimination of mouse chromosomes.

[0352] To promote further elimination of mouse chromosomes, hybrid cells were repeatedly treated with BrdU. The BrdU treatments, which destabilize the genome, result in significant loss of mouse chromosomes. The BrdU-treated 19C5xHa4 hybrid cells were divided into three groups. One group of the hybrid cells (GH) was maintained in the presence of hygromycin (200 μg/ml) and G418 (400 μg/ml), and the other two groups of the cells were cultured under G418 (G) or hygromycin (H) selection conditions to promote the elimination of the sausage chromosome or minichromosome.

[0353] One month later, single cell derived subclones were established from these three subcultures of the 19C5xHa4 hybrid line. The subclones were monitored by in situ hybridization with biotin-labeled λ phage and hamster chromosome painting probes. Four individual clones [G2B5, G3C5, G4D6, G2B4] selected in the presence of G418 that had lost the sausage chromosome but retained the minichromosome were found. Under hygromycin selection only one subclone [H1D3] lost the minichromosome. In this clone the megachromosome [see Example 5] was present.

[0354] Since hygromycin-resistance and β-galactosidase genes were thought to be expressed from the sausage chromosome, the expression of these genes was analyzed in the four subclones that had lost the sausage chromosome. In the presence of 200 μg/ml hygromycin, one hundred percent of the cells of the four individual subclones died. In order to detect β-galactosidase expression, hybrid subclones were analyzed by LacZ staining. One hundred percent of the cells of the four subclones that lost the sausage chromosome also lost the LacZ staining capability. All of the other hybrid subclones that had not lost the sausage chromosome under the non-selective culture conditions showed positive LacZ staining.

[0355] These findings demonstrate that the expression of hygromycin-resistance and β-galactosidase genes is linked to the presence of the sausage chromosome. Results of in situ hybridizations show that the heterologous DNA is expressed from the constitutive heterochromatin of the sausage chromosome.

[0356] In situ hybridization studies of three other hybrid subclones [G2C6, G2D1, and G4D5] did not detect the presence of the sausage chromosome. By the LacZ staining method, some stained cells were detected in these hybrid lines, and when these subclones were transferred to hygromycin selection some colonies survived. Cytological analysis and in situ hybridization of these hygromycin-resistant colonies revealed the presence of the sausage chromosome, suggesting that only the cells of G2C6, G2D1 and G4D5 hybrids that had not lost the sausage chromosome were able to preserve the hygromycin resistance and β-galactosidase expression. These results confirmed that the expression of these genes is linked to the presence of the sausage chromosome. The level of β-galactosidase expression was determined by the immunoblot technique using a monoclonal antibody.

[0357] Hygromycin resistance and β-galactosidase expression of the cells which contained the sausage chromosome were provided by the genes localized in the mouse pericentric heterochromatin. This was demonstrated by performing Southern DNA hybridizations on the hybrid cells that lack the sausage chromosome using PCR-amplified subfragments of hygromycin-resistance and β-galactosidase genes as probes. None of the subclones showed hybridization with these probes; however, all of the analyzed clones contained the minichromosome. Other hybrid clones that contain the sausage chromosome showed intense hybridization with these DNA probes. These results lead to the conclusion that hygromycin resistance and β-galactosidase expression of the cells that contain the sausage chromosome were provided by the genes localized in the mouse pericentric heterochromatin.

EXAMPLE 5

[0358] The Gigachromosome

[0359] As described in Example 4, the sausage chromosome was transferred into Chinese hamster cells by cell fusion. Using Hygromycin B/HAT and G418 selection, two hybrid clones 19C5xHa3 and 19C5xHa4 were produced that carry the sausage chromosome. In situ hybridization, using hamster and mouse chromosome-painting probes and a hamster chromosome 2-specific probe, verified that clone 19C5xHa4 contains a complete Chinese hamster genome as well as partial mouse genomes. Twenty-seven separate colonies of 19C5xHa4 cells were maintained and analyzed as individual subclones. Twenty-six out of 27 subclones contained a morphologically unchanged sausage chromosome.

[0360] In one subclone of the 19C5xHa3 cell line, 19C5xHa47 [see FIG. 2E], the heterochromatic arm of the sausage chromosome became unstable and showed continuous intrachromosomal growth. In extreme cases, the amplified chromosome arm exceeded 1000 Mb in size (gigachromosome).

EXAMPLE 6

[0361] The Stable Megachromosome

[0362] A. Generation of cell lines containing the megachromosome

[0363] All 19C5xHa4 subclones retained a complete hamster genome, but different subclones showed different numbers of mouse chromosomes, indicating the preferential elimination of mouse chromosomes. As described in Example 4, to promote further elimination of mouse chromosomes, hybrid cells were treated with BrdU, cultured under G418 (G) or hygromycin (H) selection conditions followed by repeated treatment with 10⁻⁴ M BrdU for 16 hours, and single cell subclones were established. The BrdU treatments appeared to destabilize the genome, resulting in a change in the sausage chromosome as well. A gradual increase in a cell population in which a further amplification had occurred was observed. In addition to the ˜100-150 Mb heterochromatic arm of the sausage chromosome, an extra centromere and a ˜150-250 Mb heterochromatic chromosome arm were formed, which differed from those of mouse chromosome 7. By the acquisition of another euchromatic terminal segment, a new submetacentric chromosome (megachromosome) was formed. Seventy-nine individual subclones were established from these BrdU-treated cultures by single-cell cloning: 42 subclones carried the intact megachromosome, 5 subclones carried the sausage chromosome, and in 32 subclones fragments or translocated segments of the megachromosome were observed. Twenty-six subclones that carried the megachromosome were cultured under non-selective conditions over a two-month period. In 19 out of 26 subclones, the megachromosome was retained. Those subclones which lost the megachromosomes all became sensitive to Hygromycin B and had no β-galactosidase expression, indicating that both markers were linked to the megachromosome.
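The subclone tallies reported above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative only and are not part of the described methods. It uses only the counts stated in the paragraph above.

```python
# Subclone counts as reported for the BrdU-treated cultures.
subclone_counts = {
    "intact megachromosome": 42,
    "sausage chromosome": 5,
    "fragments or translocated segments": 32,
}

# The three categories should add up to the 79 subclones established.
total_subclones = sum(subclone_counts.values())
fractions = {name: n / total_subclones for name, n in subclone_counts.items()}

# Retention of the megachromosome after two months without selection.
cultured, retained = 26, 19
retention_rate = retained / cultured

if __name__ == "__main__":
    print(f"total subclones: {total_subclones}")
    for name, frac in fractions.items():
        print(f"{name}: {frac:.1%}")
    print(f"retained without selection: {retention_rate:.1%}")
```

On these figures, roughly half of the subclones carried the intact megachromosome, and about three quarters of the cultured subclones retained it without selection.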

[0364] Two sublines (G3D5 and H1D3), which were chosen for further experiments, showed no changes in the morphology of the megachromosome during more than 100 generations under selective conditions. The G3D5 cells had been obtained by growth of 19C5xHa4 cells in G418-containing medium followed by repeated BrdU treatment, whereas H1D3 cells had been obtained by culturing 19C5xHa4 cells in hygromycin-containing medium followed by repeated BrdU treatment.

[0365] B. Structure of the megachromosome

[0366] The following results demonstrate that, apart from the euchromatic terminal segments and the integrated foreign DNA (and, as in the exemplified embodiments, rDNA sequence), the whole megachromosome is constitutive heterochromatin, containing a tandem array of at least 40 [˜7.5 Mb] blocks of mouse major satellite DNA [see FIGS. 2 and 3]. Four satellite DNA blocks are organized into a giant palindrome [amplicon] carrying integrated exogenous DNA sequences at each end. The long and short arms of the submetacentric megachromosome contain 6 and 4 amplicons, respectively. It is of course understood that the specific organization and size of each component can vary among species, and also with the chromosome in which the amplification event initiates.
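The block and amplicon numbers given above fit together arithmetically. The sketch below, in Python, multiplies them out; it assumes each block is exactly 7.5 Mb, whereas the text gives only approximate sizes, and the constant and function names are illustrative.

```python
# Approximate figures taken from the paragraph above.
BLOCK_MB = 7.5            # size of one major satellite block (approximate)
BLOCKS_PER_AMPLICON = 4   # four blocks form one palindromic amplicon
AMPLICONS_PER_ARM = {"long arm": 6, "short arm": 4}


def satellite_mb(n_amplicons: int) -> float:
    """Estimated major satellite content (Mb) of n amplicons."""
    return n_amplicons * BLOCKS_PER_AMPLICON * BLOCK_MB


total_amplicons = sum(AMPLICONS_PER_ARM.values())
total_blocks = total_amplicons * BLOCKS_PER_AMPLICON
arm_sizes = {arm: satellite_mb(n) for arm, n in AMPLICONS_PER_ARM.items()}

if __name__ == "__main__":
    print(f"amplicons: {total_amplicons}, blocks: {total_blocks}")
    for arm, mb in arm_sizes.items():
        print(f"{arm}: ~{mb:.0f} Mb of major satellite DNA")
    print(f"whole chromosome: ~{satellite_mb(total_amplicons):.0f} Mb")
```

Ten amplicons of four blocks give the 40 blocks stated in the text. The arm sizes are estimates of satellite content only and exclude the terminal segments and the integrated foreign DNA.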

[0367] 1. The megachromosome is composed primarily of heterochromatin

[0368] Except for the terminal regions and the integrated foreign DNA, the megachromosome is composed primarily of heterochromatin. This was demonstrated by C-banding of the megachromosome, which resulted in positive staining characteristic of constitutive heterochromatin. Apart from the terminal regions and the integrated foreign DNA, the whole megachromosome appears to be heterochromatic. Mouse major satellite DNA is the main component of the pericentric, constitutive heterochromatin of mouse chromosomes and represents ˜10% of the total DNA [Waring et al. (1966) Science 154:791-794]. Using a mouse major satellite DNA probe for in situ hybridization, strong hybridization was observed throughout the megachromosome, except for its terminal regions. The hybridization showed a segmented pattern: four large blocks appeared on the short arm and usually 4-7 blocks were seen on the long arm. By comparing these segments with the pericentric regions of normal mouse chromosomes that carry ˜15 Mb of major satellite DNA, the size of the blocks of major satellite DNA on the megachromosome was estimated to be ˜30 Mb.

[0369] Using a mouse probe specific to euchromatin [pMCPE1.51; a mouse long interspersed repeated DNA probe], positive hybridization was detected only on the terminal segments of the megachromosome of the H1D3 hybrid subline. In the G3D5 hybrids, hybridization with a hamster-specific probe revealed that several megachromosomes contained terminal segments of hamster origin on the long arm. This observation indicated that the acquisition of the terminal segments on these chromosomes happened in the hybrid cells, and that the long arm of the megachromosome was the recently formed arm. When a mouse minor satellite probe, specific to the centromeres of mouse chromosomes [Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], was used, a strong hybridization signal was detected only at the primary constriction of the megachromosome, which colocalized with the positive immuno-fluorescence signal produced with human anti-centromere serum [LU851].

[0370] In situ hybridization experiments with pH132, pCH110, and λ DNA probes revealed that all heterologous DNA was located in the gaps between the mouse major satellite DNA segments. Each segment of mouse major satellite DNA was bordered by a narrow band of integrated heterologous DNA, except at the second segment of the long arm where a double band of heterologous DNA existed, indicating that the major satellite DNA segment was missing or considerably reduced in size here. This chromosome region served as a useful cytological marker in identifying the long arm of the megachromosome. At a frequency of 10⁻⁴, “restoration” of these missing satellite DNA blocks was observed in one chromatid, when the formation of a whole segment on one chromatid occurred.

[0371] After Hoechst 33258 treatment (50 μg/ml for 16 hours), the megachromosome showed undercondensation throughout its length except for the terminal segments. This made it possible to study the architecture of the megachromosome at higher resolution. In situ hybridization with the mouse major satellite probe on undercondensed megachromosomes demonstrated that the ˜30 Mb major satellite segments were composed of four blocks of ˜7.5 Mb separated from each other by a narrow band of non-hybridizing sequences [FIG. 3]. Similar segmentation can be observed in the large block of pericentric heterochromatin in metacentric mouse chromosomes from the LMTK⁻ and A9 cell lines.

[0372] 2. The megachromosome is composed of segments containing two tandem ˜7.5 Mb blocks followed by two inverted blocks

[0373] Because of the asymmetry in thymidine content between the two strands of the DNA of the mouse major satellite, when mouse cells are grown in the presence of BrdU for a single S phase, the constitutive heterochromatin shows lateral asymmetry after FPG staining. Also, in the 19C5xHa4 hybrids, the thymidine-kinase [Tk] deficiency of the mouse fibroblast cells was complemented by the hamster Tk gene, permitting BrdU incorporation experiments.

[0374] A striking structural regularity in the megachromosome was detected using the FPG technique. In both chromatids, alternating dark and light staining that produced a checkered appearance of the megachromosome was observed. A similar picture was obtained by labelling with fluorescein-conjugated anti-BrdU antibody. Comparing these pictures to the segmented appearance of the megachromosome showed that one dark and one light FPG band corresponded to one ˜30 Mb segment of the megachromosome. These results suggest that the two halves of the ˜30 Mb segment have an inverted orientation. This was verified by combining in situ hybridization and immunolabelling of the incorporated BrdU with fluorescein-conjugated anti-BrdU antibody on the same chromosome. Since the ˜30 Mb segments [or amplicons] of the megachromosome are composed of four blocks of mouse major satellite DNA, it can be concluded that two tandem ˜7.5 Mb blocks are followed by two inverted blocks within one segment.
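The inferred arrangement, two tandem blocks followed by two inverted blocks, can be illustrated with a toy model. The Python sketch below is an illustration under stated assumptions, not a description of the staining chemistry: it labels block orientation as "+" or "-", assumes each block is exactly 7.5 Mb, and treats each run of same-orientation blocks as one FPG band on a given chromatid.

```python
from itertools import groupby

BLOCK_MB = 7.5                     # approximate block size from the text
AMPLICON = ["+", "+", "-", "-"]    # two tandem blocks, then two inverted


def fpg_bands(n_amplicons: int) -> list[tuple[str, float]]:
    """Collapse consecutive same-orientation blocks into (orientation, Mb) bands."""
    blocks = AMPLICON * n_amplicons
    return [(orient, BLOCK_MB * len(list(run))) for orient, run in groupby(blocks)]


if __name__ == "__main__":
    # One amplicon yields two bands of opposite orientation.
    print(fpg_bands(1))
    # A six-amplicon arm yields a regular alternation of twelve bands.
    print(len(fpg_bands(6)))
```

In this model each amplicon produces exactly two bands of about 15 Mb each, so one dark and one light band together span one ˜30 Mb segment, and the bands alternate regularly along the arm.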

[0375] Large-scale mapping of megachromosome DNA by pulsed-field electrophoresis and Southern hybridization with “foreign” DNA probes revealed a simple pattern of restriction fragments. Using endonucleases with none, or only a single cleavage site in the integrated foreign DNA sequences, followed by hybridization with a hyg probe, 1-4 predominant fragments were detected. Since the megachromosome contains 10-12 amplicons with an estimated 3-8 copies of hyg sequences per amplicon (30-90 copies per megachromosome), the small number of hybridizing fragments indicates the homogeneity of DNA in the amplified segments.
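The copy-number estimate above follows from multiplying the two ranges. The Python sketch below shows the arithmetic; the function name is illustrative. The raw product of the upper bounds is 96, slightly above the rounded figure of 90 given in the text.

```python
def copy_range(amplicons: tuple[int, int], copies_per_amplicon: tuple[int, int]) -> tuple[int, int]:
    """Lowest and highest total copy number implied by two (min, max) ranges."""
    low = amplicons[0] * copies_per_amplicon[0]
    high = amplicons[1] * copies_per_amplicon[1]
    return low, high


# Ranges stated in the paragraph above.
hyg_copies = copy_range(amplicons=(10, 12), copies_per_amplicon=(3, 8))

if __name__ == "__main__":
    print(f"estimated hyg copies per megachromosome: {hyg_copies[0]}-{hyg_copies[1]}")
```

Against tens of copies per chromosome, detecting only 1-4 predominant fragments is what would be expected if the copies sit in near-identical sequence contexts.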

[0376] 3. Scanning electron microscopy of the megachromosome confirmed the above findings

[0377] The homogeneous architecture of the heterochromatic arms of the megachromosome was confirmed by high resolution scanning electron microscopy. Extended arms of megachromosomes, and the pericentric heterochromatic region of mouse chromosomes, treated with Hoechst 33258, showed similar structure. The constitutive heterochromatic regions appeared more compact than the euchromatic segments. Apart from the terminal regions, both arms of the megachromosome were completely extended, and showed faint grooves, which should correspond to the border of the satellite DNA blocks in the non-amplified chromosomes and in the megachromosome. Without Hoechst treatment, the grooves seemed to correspond to the amplicon borders on the megachromosome arms. In addition, centromeres showed a more compact, finely fibrous appearance than the surrounding heterochromatin.

[0378] 4. The megachromosome of 1B3 cells contains rRNA gene sequence

[0379] The sequence of the megachromosome in the region of the sites of integration of the heterologous DNA was investigated by isolating these regions using cloning methods and by sequence analysis of the resulting clones. The results of this analysis revealed that the heterologous DNA was located near mouse ribosomal RNA gene (i.e., rDNA) sequences contained in the megachromosome.

[0380] a. Cloning of regions of the megachromosomes in which heterologous DNA had integrated

[0381] Megachromosomes were isolated from 1B3 cells (which were generated by repeated BrdU treatment and single cell cloning of H1xHE41 cells (see FIG. 4) and which contain a truncated megachromosome) using fluorescence-activated cell sorting methods as described herein (see Example 10). Following separation of the SATACs (megachromosomes) from the endogenous chromosomes, the isolated megachromosomes were stored in GH buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with saturated calcium hydroxide solution; see Example 10) and centrifuged into an agarose bed in 0.5 M EDTA.

[0382] Large-scale mapping of the megachromosome around the area of the site of integration of the heterologous DNA revealed that it is enriched in sequence containing rare-cutting enzyme sites, such as the recognition site for NotI. Additionally, mouse major satellite DNA (which makes up the majority of the megachromosome) does not contain NotI recognition sites. Therefore, to facilitate isolation of regions of the megachromosome associated with the site of integration of the heterologous DNA, the isolated megachromosomes were cleaved with NotI, a rare-cutting restriction endonuclease with an 8-bp GC recognition site. Fragments of the megachromosome were inserted into plasmid pWE15 (Stratagene, La Jolla, Calif.) as follows. Half of a 100-μl low melting point agarose block (mega-plug) containing the isolated SATACs was digested with NotI overnight at 37° C. Plasmid pWE15 was similarly digested with NotI overnight. The mega-plug was then melted and mixed with the digested plasmid, ligation buffer and T4 ligase. Ligation was conducted at 16° C. overnight. Bacterial DH5α cells were transformed with the ligation product and transformed cells were plated onto LB/Amp plates. Fifteen to twenty colonies were grown on each plate for a total of 189 colonies. Plasmid DNA was isolated from colonies that survived growth on LB/Amp medium and was analyzed by Southern blot hybridization for the presence of DNA that hybridized to a pUC19 probe. This screening methodology assured that all clones, even clones lacking an insert but yet containing the pWE15 plasmid, would be detected. Any clones containing insert DNA would be expected to contain non-satellite, GC-rich megachromosome DNA sequences located at the site of integration of the heterologous DNA. All colonies were positive for hybridizing DNA.
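The rationale for using NotI can be illustrated in silico. The Python sketch below locates NotI recognition sites (GCGGCCGC, cut after the second base, GC^GGCCGC) in a DNA string and reports the fragment lengths expected for a linear molecule. The test sequence is a made-up toy string, not megachromosome sequence, and the function names are illustrative.

```python
NOTI_SITE = "GCGGCCGC"   # NotI recognition sequence
CUT_OFFSET = 2           # NotI cuts between GC and GGCCGC on the top strand


def find_sites(seq: str, site: str = NOTI_SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def noti_fragment_lengths(seq: str) -> list[int]:
    """Fragment lengths from complete NotI digestion of a linear sequence."""
    cuts = [p + CUT_OFFSET for p in find_sites(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical toy sequence carrying two NotI sites.
    toy = "AAAA" + NOTI_SITE + "TTTTTT" + NOTI_SITE + "CC"
    print(find_sites(toy))
    print(noti_fragment_lengths(toy))
    # An AT-rich stretch without the site is left uncut.
    print(noti_fragment_lengths("ATATATATATAT"))
```

A sequence with no NotI site comes back as a single uncut fragment, which is the behavior the text relies on for the satellite DNA that makes up most of the megachromosome.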

[0383] Liquid cultures of all 189 transformants were used to generate cosmid minipreps for analysis of restriction sites within the insert DNA. Six of the original 189 cosmid clones contained an insert. These clones were designated as follows: 28 (˜9-kb insert), 30 (˜9-kb insert), 60 (˜4-kb insert), 113 (˜9-kb insert), 157 (˜9-kb insert) and 161 (˜9-kb insert). Restriction enzyme analysis indicated that three of the clones (113, 157 and 161) contained the same insert.

[0384] b. In situ hybridization experiments using isolated segments of the megachromosome as probes

[0385] Insert DNA from clones 30, 113, 157 and 161 was purified, labeled and used as probes in in situ hybridization studies of several cell lines. Counterstaining of the cells with propidium iodide facilitated identification of the cytological sites of the hybridization signals. The locations of the signals detected within the cells are summarized in the following table:

CELL TYPE | PROBE | LOCATION OF SIGNAL
Human Lymphocyte (male) | No. 161 | 4-5 pairs of acrocentric chromosomes at centromeric regions.
Mouse Spleen | No. 161 | Acrocentric ends of 4 pairs of chromosomes.
EC3/7C5 Cells | No. 161 | Minichromosome and the end of the formerly dicentric chromosome. Pericentric heterochromatin of one of the metacentric mouse chromosomes. Centromeric region of some of the other mouse chromosomes.
K20 Chinese Hamster Cells | No. 30 | Ends of at least 6 pairs of chromosomes. An interstitial signal on a short chromosome.
HB31 Cells (mouse-hamster hybrid cells derived from H1D3 cells by repeated BrdU treatment and single cell cloning which carries the megachromosome) | No. 30 | Acrocentric ends of at least 12 pairs of chromosomes. Centromeres of certain chromosomes and the megachromosome. Borders of the amplicons of the megachromosome.
Mouse Spleen Cells | No. 30 | Similar to signal observed for probe no. 161. Centromeres of 5 pairs of chromosomes. Weak cross-hybridization to pericentric heterochromatin.
HB31 Cells | No. 113 | Similar to signal observed for probe no. 30.
Mouse Spleen Cells | No. 113 | Centromeric region of 5 pairs of chromosomes.
K20 Cells | No. 113 | At least 6 pairs of chromosomes. Weak signal at some telomeres and several interspersed signals.
Human Lymphocyte Cells (male) | No. 157 | Similar to signal observed for probe no. 161.

[0386] c. Southern blot hybridization using isolated segments of the megachromosome as probes

[0387] DNA was isolated from mouse spleen tissue, mouse LMTK⁻ cells, K20 Chinese hamster ovary cells, EJ30 human fibroblast cells and H1D3 cells. The isolated DNA, together with lambda phage DNA, was subjected to Southern blot hybridization using inserts isolated from megachromosome clone nos. 30, 113, 157 and 161 as probes. Plasmid pWE15 was used as a negative control probe. Each of the four megachromosome clone inserts hybridized in a multi-copy manner (as demonstrated by the intensity of hybridization and the number of hybridizing bands) to all of the DNA samples, except the lambda phage DNA. Plasmid pWE15 hybridized to lambda DNA only.

[0388] d. Sequence analysis of megachromosome clone no. 161

[0389] Megachromosome clone no. 161 appeared to show the strongest hybridization in the in situ and Southern hybridization experiments and was chosen for analysis of the insert sequence. The sequence analysis was approached by first subcloning the insert of cosmid clone no. 161 to obtain five subclones as follows.

[0390] To obtain the end fragments of the insert of clone no. 161, the clone was digested with NotI and BamHI and ligated with NotI/BamHI-digested pBluescript KS (Stratagene, La Jolla, Calif.). Two fragments of the insert of clone no. 161 were obtained: a 0.2-kb and a 0.7-kb insert fragment. To subclone the internal fragment of the insert of clone no. 161, the same digest was ligated with BamHI-digested pUC19. Three fragments of the insert of clone no. 161 were obtained: a 0.6-kb, a 1.8-kb and a 4.8-kb insert fragment.

[0391] The ends of all the subcloned insert fragments were first sequenced manually. However, due to their extremely high GC content, autoradiographs were difficult to interpret and sequencing was repeated using an ABI sequencer and the dye-terminator cycle protocol. A comparison of the sequence data to sequences in the GENBANK database revealed that the insert of clone no. 161 corresponds to an internal section of the mouse ribosomal RNA gene (rDNA) repeat unit between positions 7551-15670 as set forth in GENBANK accession no. X82564, which is provided as SEQ ID NO. 16 herein. The sequence data obtained for the insert of clone no. 161 is set forth in SEQ ID NOS. 18-24. Specifically, the individual subclones corresponded to the following positions in GENBANK accession no. X82564 (i.e., SEQ ID NO. 16) and in SEQ ID NOs. 18-24:

Subclone | Start in X82564 | End in X82564 | Site | SEQ ID No.
161k1 | 7579 | 7755 | NotI, BamHI | 18
161m5 | 7756 | 8494 | BamHI | 19
161m7 | 8495 | 10231 | BamHI | 20 (shows only sequence corresponding to nt. 8495-8950), 21 (shows only sequence corresponding to nt. 9851-10231)
161m12 | 10232 | 15000 | BamHI | 22 (shows only sequence corresponding to nt. 10232-10600), 23 (shows only sequence corresponding to nt. 14267-15000)
161k2 | 15001 | 15676 | NotI, BamHI | 24

[0392] The sequence set forth in SEQ ID NOs. 18-24 diverges in some positions from the sequence presented in positions 7551-15670 of GENBANK accession no. X82564. Such divergence may be attributable to random mutations between repeat units of rDNA. The results of the sequence analysis of clone no. 161, which reveal that it corresponds to rDNA, correlate with the appearance of the in situ hybridization signal it generated in human lymphocytes and mouse spleen cells. The hybridization signal was clearly observed on acrocentric chromosomes in these cells, and such types of chromosomes are known to include rDNA adjacent to the pericentric satellite DNA on the short arm of the chromosome. Furthermore, rRNA genes are highly conserved in mammals as supported by the cross-species hybridization of clone no. 161 to human chromosomal DNA.

[0393] To isolate amplification-replication control regions such as those found in rDNA, it may be possible to subject DNA isolated from megachromosome-containing cells, such as H1D3 cells, to nucleic acid amplification using, e.g., the polymerase chain reaction (PCR) with the following primers:

[0394] amplification control element forward primer (1-30)

[0395] 5′-GAGGAATTCCCCATCCCTAATCCAGATTGGTG-3′ (SEQ ID NO. 25)

[0396] amplification control element reverse primer (2142-2111)

[0397] 5′-AAACTGCAGGCCGAGCCACCTCTCTTCTGTGTTTG-3′ (SEQ ID NO. 26)

[0398] origin of replication region forward primer (2116-2141)

[0399] 5′-AGGAATTCACAGAAGAGAGGTGGCTCGGCCTGC-3′ (SEQ ID NO. 27)

[0400] origin of replication region reverse primer (5546-5521)

[0401] 5′-AGCCTGCAGGAAGTCATACCTGGGGAGGTGGCCC-3′ (SEQ ID NO. 28)
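The four primers listed above can be inspected programmatically. The following is a minimal sketch in Python, not part of the described method: it scans each primer for the hexamers GAATTC and CTGCAG, which are the recognition sequences of EcoRI and PstI, and reports primer length and GC fraction. The text does not name the enzymes, so treating these hexamers as intended cloning sites is an assumption; the function names are hypothetical.

```python
# Recognition sequences of two common restriction enzymes. That these were the
# intended cloning sites is an assumption; the source does not name enzymes.
SITES = {"EcoRI": "GAATTC", "PstI": "CTGCAG"}

# Primer sequences copied from SEQ ID NOs. 25-28 above (5' to 3').
PRIMERS = {
    "SEQ ID NO. 25": "GAGGAATTCCCCATCCCTAATCCAGATTGGTG",
    "SEQ ID NO. 26": "AAACTGCAGGCCGAGCCACCTCTCTTCTGTGTTTG",
    "SEQ ID NO. 27": "AGGAATTCACAGAAGAGAGGTGGCTCGGCCTGC",
    "SEQ ID NO. 28": "AGCCTGCAGGAAGTCATACCTGGGGAGGTGGCCC",
}


def find_sites(seq):
    """Return {enzyme: 0-based index of first match} for sites present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers):
    """Collect length, GC fraction and detected sites for each primer."""
    return {
        name: {
            "length": len(seq),
            "gc": round(gc_fraction(seq), 2),
            "sites": find_sites(seq),
        }
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, info in summarize(PRIMERS).items():
        print(name, info)
```

In this sketch both forward primers carry GAATTC near the 5′ end and both reverse primers carry CTGCAG, which would be consistent with directional cloning of the amplified fragments, although the text does not state this.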

[0402] C. Summary of the formation of the megachromosome

[0403] FIG. 2 schematically sets forth events leading to the formation of a stable megachromosome beginning with the generation of a dicentric chromosome in a mouse LMTK⁻ cell line: (A) A single E-type amplification in the centromeric region of the mouse chromosome 7 following transfection of LMTK⁻ cells with λCM8 and λgtWESneo generates the neo-centromere linked to the integrated foreign DNA, and forms a dicentric chromosome. Multiple E-type amplification forms the λneo-chromosome, which was derived from chromosome 7 and stabilized in a mouse-hamster hybrid cell line; (B) Specific breakage between the centromeres of a dicentric chromosome 7 generates a chromosome fragment with the neo-centromere, and a chromosome 7 with traces of foreign DNA at the end; (C) Inverted duplication of the fragment bearing the neo-centromere results in the formation of a stable neo-minichromosome; (D) Integration of exogenous DNA into the foreign DNA region of the formerly dicentric chromosome 7 initiates H-type amplification, and the formation of a heterochromatic arm. By capturing a euchromatic terminal segment, this new chromosome arm is stabilized in the form of the “sausage” chromosome; (E) BrdU treatment and/or drug selection appears to induce further H-type amplification, which results in the formation of an unstable gigachromosome; (F) Repeated BrdU treatments and/or drug selection induce further H-type amplification including a centromere duplication, which leads to the formation of another heterochromatic chromosome arm. It is split off from the chromosome 7 by chromosome breakage and acquires a terminal segment to form the stable megachromosome.

[0404] D. Expression of β-galactosidase and hygromycin transferase genes in cell lines carrying the megachromosome or derivatives thereof

[0405] The level of heterologous gene (i.e., β-galactosidase and hygromycin transferase genes) expression in cell lines containing the megachromosome or a derivative thereof was quantitatively measured. The relationship between the copy-number of the heterologous genes and the level of protein expressed therefrom was also determined.

[0406] 1. Materials and methods

[0407] a. Cell lines

[0408] Heterologous gene expression levels of H1D3 cells, carrying a 250-400 Mb megachromosome as described above, and mM2C1 cells, carrying a 50-60 Mb micro-megachromosome, were quantitatively evaluated. mM2C1 cells were generated by repeated BrdU treatment and single cell cloning of the H1xHe41 cell line (mouse-hamster-human hybrid cell line carrying the megachromosome and a single human chromosome with CD4 and neo^(r) genes; see FIG. 4). The cell lines were grown under standard conditions in F12 medium under selective (120 μg/ml hygromycin) or non-selective conditions.

[0409] b. Preparation of cell extract for β-galactosidase assays

[0410] Monolayers of mM2C1 or H1D3 cell cultures were washed three times with phosphate-buffered saline (PBS). Cells were scraped with a rubber policeman, suspended and washed again in PBS. Washed cells were resuspended in 0.25 M Tris-HCl, pH 7.8, and disrupted by three cycles of freezing in liquid nitrogen and thawing at 37° C. The extract was clarified by centrifugation at 12,000 rpm for 5 min. at 4° C.

[0411] c. β-galactosidase assay

[0412] The β-galactosidase assay mixture contained 1 mM MgCl₂, 45 mM β-mercaptoethanol, 0.8 mg/ml o-nitrophenyl-β-D-galactopyranoside and 66 mM sodium phosphate, pH 7.5. After incubating the reaction mixture with the cell extract at 37° C. for increasing time, the reaction was terminated by the addition of three volumes of 1 M Na₂CO₃, and the optical density was measured at 420 nm. Assay mixture incubated without cell extract was used as a control. The linear range of the reaction was determined to be between 0.1-0.8 OD₄₂₀. One unit of β-galactosidase activity is defined as the amount of enzyme that will hydrolyse 3 nmoles of o-nitrophenyl-β-D-galactopyranoside in 1 minute at 37° C.

[0413] d. Preparation of cell extract for hygromycin phosphotransferase assay

[0414] Cells were washed as described above and resuspended in buffer (20 mM Hepes, pH 7.3, 100 mM potassium acetate, 5 mM Mg acetate and 2 mM dithiothreitol). Cells were disrupted at 0° C. by six 10 sec bursts in an MSE ultrasonic disintegrator using a microtip probe. Cells were allowed to cool for 1 min after each ultrasonic burst. The extracts were clarified by centrifuging for 1 min at 2000 rpm in a microcentrifuge.

[0415] e. Hygromycin phosphotransferase assay

[0416] Enzyme activity was measured by means of the phosphocellulose paper binding assay as described by Haas and Dowding [(1975). Meth. Enzymol. 43:611-628]. The cell extract was supplemented with 0.1 M ammonium chloride and 1 mM adenosine-β-³²P-triphosphate (specific activity: 300 Ci/mmol). The reaction was initiated by the addition of 0.1 mg/ml hygromycin and incubated for increasing time at 37° C. The reaction was terminated by heating the samples for 5 min at 75° C. in a water bath, and after removing the precipitated proteins by centrifugation for 5 min in a microcentrifuge, an aliquot of the supernatant was spotted on a piece of Whatman P-81 phosphocellulose paper (2 cm²). After 30 sec at room temperature the papers were placed into 500 ml of hot (75° C.) distilled water for 3 min. While the radioactive ATP remains in solution under these conditions, hygromycin phosphate binds strongly and quantitatively to phosphocellulose. The papers were rinsed 3 times in 500 ml of distilled water and the bound radioactivity was measured in toluene scintillation cocktail in a Beckman liquid scintillation counter. Reaction mixture incubated without added hygromycin served as a control.

[0417] f. Determination of the copy-number of the heterologous genes

[0418] DNA was prepared from the H1D3 and mM2C1 cells using standard purification protocols involving SDS lysis of the cells followed by Proteinase K treatment and phenol/chloroform extractions. The isolated DNA was digested with an appropriate restriction endonuclease, fractionated on agarose gels, blotted to nylon filters and hybridized with a radioactive probe derived either from the β-galactosidase or the hygromycin phosphotransferase genes. The level of hybridization was quantified in a Molecular Dynamics PhosphorImage Analyzer. To control for the total amount of DNA loaded from the different cell lines, the filters were reprobed with a single copy gene, and the hybridization of β-galactosidase and hygromycin phosphotransferase genes was normalized to the single copy gene hybridization.
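The normalization described above can be illustrated with a short sketch in Python. It assumes that the hybridization signal of each probe is available as a single number per lane; the function names and the signal values are hypothetical and are not measurements from this work.

```python
def normalized_signal(target_counts, single_copy_counts):
    """Target-gene hybridization signal divided by the single copy gene signal."""
    if single_copy_counts <= 0:
        raise ValueError("single copy gene signal must be positive")
    return target_counts / single_copy_counts


def relative_copy_number(sample, reference):
    """Ratio of normalized signals between two cell lines.

    Each argument is a (target_counts, single_copy_counts) tuple for one lane.
    """
    return normalized_signal(*sample) / normalized_signal(*reference)


if __name__ == "__main__":
    # Hypothetical phosphorimager readings, chosen only to illustrate the
    # arithmetic; they are not data from the experiments described here.
    lane_a = (50000.0, 1000.0)  # target signal, single copy gene signal
    lane_b = (6000.0, 1200.0)
    print(relative_copy_number(lane_a, lane_b))
```

Dividing by the single copy gene signal removes lane-to-lane differences in the amount of DNA loaded, so that the remaining ratio reflects the relative number of target gene copies.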

[0419] g. Determination of protein concentration

[0420] The total protein content of the cell extracts was measured by the Bradford colorimetric assay using bovine serum albumin as standard.

[0421] 2. Characterization of the β-galactosidase and hygromycin phosphotransferase activity expressed in H1D3 and mM2C1 cells

[0422] In order to establish quantitative conditions, the most important kinetic parameters of β-galactosidase and hygromycin phosphotransferase activity have been studied. The β-galactosidase activity measured with a colorimetric assay was linear within the 0.1-0.8 OD₄₂₀ range both for the mM2C1 and H1D3 cell lines. The β-galactosidase activity was also proportional in both cell lines to the amount of protein added to the reaction mixture within the 5-100 μg total protein concentration range. The hygromycin phosphotransferase activity of mM2C1 and H1D3 cell lines was also proportional to the reaction time or the total amount of added cell extract under the conditions described for the β-galactosidase.

[0423] a. Comparison of β-galactosidase activity of mM2C1 and H1D3 cell lines

[0424] Cell extracts prepared from logarithmically growing mM2C1 and H1D3 cell lines were tested for β-galactosidase activity, and the specific activities were compared in 10 independent experiments. The β-galactosidase activity of H1D3 cell extracts was 440±25 U/mg total protein. Under identical conditions the β-galactosidase activity of the mM2C1 cell extracts was 4.8 times lower: 92±13 U/mg total protein.
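The arithmetic behind these specific activities can be sketched as follows, using the unit definition given for the β-galactosidase assay (one unit hydrolyses 3 nmoles of substrate per minute at 37° C.). This is an illustrative calculation in Python with hypothetical function names, not part of the assay protocol.

```python
# From the unit definition in the assay description: 1 U = 3 nmol/min at 37 C.
NMOL_PER_MIN_PER_UNIT = 3.0


def units_to_umol_per_min(units_per_mg):
    """Convert specific activity in U/mg into micromoles/min per mg protein."""
    return units_per_mg * NMOL_PER_MIN_PER_UNIT / 1000.0


def fold_difference(activity_a, activity_b):
    """How many times higher activity_a is than activity_b."""
    return activity_a / activity_b


if __name__ == "__main__":
    h1d3 = 440.0   # U/mg total protein, mean reported for H1D3 extracts
    mm2c1 = 92.0   # U/mg total protein, mean reported for mM2C1 extracts
    print(round(fold_difference(h1d3, mm2c1), 1))   # ratio of the two means
    print(units_to_umol_per_min(h1d3))              # micromoles/min per mg
```

The ratio of the two mean values reproduces the 4.8-fold difference stated above; the reported standard deviations are not propagated in this sketch.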

[0425] β-galactosidase activities of highly subconfluent, subconfluent and nearly confluent cultures of H1D3 and mM2C1 cell lines were also compared. In these experiments different numbers of logarithmic H1D3 and mM2C1 cells were seeded in a constant volume of culture medium and grown for 3 days under standard conditions. No significant difference was found in the β-galactosidase specific activities of cell cultures grown at different cell densities, and the ratio of H1D3/mM2C1 β-galactosidase specific activities was also similar for all three cell densities. In confluent, stationary cell cultures of H1D3 or mM2C1 cells, however, the expression of β-galactosidase significantly decreased, likely due to cessation of cell division as a result of contact inhibition.

[0426] b. Comparison of hygromycin phosphotransferase activity of H1D3 and mM2C1 cell lines

[0427] The bacterial hygromycin phosphotransferase is present in a membrane-bound form in H1D3 or mM2C1 cell lines. This follows from the observation that the hygromycin phosphotransferase activity can be completely removed by high speed centrifugation of these cell extracts, and the enzyme activity can be recovered by resuspending the high speed pellet.

[0428] The ratio of the enzyme's specific activity in H1D3 and mM2C1 cell lines was similar to that of β-galactosidase activity, i.e., H1D3 cells have 4.1 times higher specific activity compared with mM2C1 cells.

[0429] c. Hygromycin phosphotransferase activity in H1D3 and mM2C1 cells grown under non-selective conditions

[0430] The level of expression of the hygromycin phosphotransferase gene was measured on the basis of quantitation of the specific enzyme activities in H1D3 and mM2C1 cell lines grown under non-selective conditions for 30 generations. The absence of hygromycin in the medium did not influence the expression of the hygromycin phosphotransferase gene.

[0431] 3. Quantitation of the number of β-galactosidase and hygromycin phosphotransferase gene copies in H1D3 and mM2C1 cell lines

[0432] As described above, the β-galactosidase and hygromycin phosphotransferase genes are located only within the megachromosome, or micro-megachromosome in H1D3 and mM2C1 cells. Quantitative analysis of genomic Southern blots of DNA isolated from H1D3 and mM2C1 cell lines with the PhosphorImage Analyzer revealed that the copy number of β-galactosidase genes integrated into the megachromosome is approximately 10 times higher in H1D3 cells than in mM2C1 cells. The copy-number of hygromycin phosphotransferase genes is approximately 7 times higher in H1D3 cells than in mM2C1 cells.

[0433] 4. Summary and conclusions of results of quantitation of heterologous gene expression in cells containing megachromosomes or derivatives thereof

[0434] Quantitative determination of β-galactosidase activity of higher eukaryotic cells (e.g., H1D3 cells) carrying the bacterial β-galactosidase gene in heterochromatic megachromosomes confirmed the observed high-level expression of the integrated bacterial gene detected by cytological staining methods. It has generally been established in reports of studies of the expression of foreign genes in transgenic animals that, although transgene expression shows correct tissue and developmental specificity, the level of expression is typically low and shows extensive position-dependent variability (i.e., the level of transgene expression depends on the site of chromosomal integration). It has been assumed that the low-level transgene expression may be due to the absence of special DNA sequences which can insulate the transgene from the inhibitory effect of the surrounding chromatin and promote the formation of active chromatin structure required for efficient gene expression. Several cis-acting DNA sequence elements have been identified that abolish this position-dependent variability and can ensure high-level expression of the transgene: locus activating region (LAR) sequences in higher eukaryotes and specific chromatin structure (scs) elements in lower eukaryotes (see, e.g., Eissenberg and Elgin (1991) Trends in Genet. 7:335-340). If these cis-acting DNA sequences are absent, the level of transgene expression is low and copy-number independent.

[0435] Although the bacterial β-galactosidase reporter gene contained in the heterochromatic megachromosomes of H1D3 and mM2C1 cells is driven by a potent eukaryotic promoter-enhancer element, no specific cis-acting DNA sequence element was designed and incorporated into the bacterial DNA construct which could function as a boundary element. Thus, the high-level β-galactosidase expression measured in these cells is of significance, particularly because the β-galactosidase gene in the megachromosome is located in a long, compact heterochromatic environment, which is known to be able to block gene expression. The megachromosome appears to contain DNA sequence element(s) in association with the bacterial DNA sequences that function to override the inhibitory effect of heterochromatin on gene expression.

[0436] The specificity of the heterologous gene expression in the megachromosome is further supported by the observation that the level of β-galactosidase expression is copy-number dependent. In the H1D3 cell line, which carries a full-size megachromosome, the specific activity of β-galactosidase is about 5-fold higher than in mM2C1 cells, which carry only a smaller, truncated version of the megachromosome. A comparison of the number of β-galactosidase gene copies in H1D3 and mM2C1 cell lines by quantitative hybridization techniques confirmed that the expression of β-galactosidase is copy-number dependent. The number of integrated β-galactosidase gene copies is approximately 10-fold higher in the H1D3 cells than in mM2C1 cells. Thus, the cell line containing the greater number of copies of the β-galactosidase gene also yields higher levels of β-galactosidase activity, which supports the copy-number dependency of expression. The copy number dependency of the β-galactosidase and hygromycin phosphotransferase enzyme levels in cell lines carrying different derivatives of the megachromosome indicates that neither the chromatin organization surrounding the site of integration of the bacterial genes, nor the heterochromatic environment of the megachromosome suppresses the expression of the genes.
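The relationship between the fold differences in enzyme activity and in gene copy number reported for the two cell lines can be tabulated with a small Python sketch. The fold values are the approximate figures given in the text; the function name and the idea of expressing them as a single ratio are illustrative assumptions, not an analysis performed in this work.

```python
def activity_per_copy_ratio(activity_fold, copy_fold):
    """Fold difference in specific activity divided by fold difference in copies.

    A value of 1.0 would correspond to strictly proportional expression; this
    reading is an illustrative simplification.
    """
    return activity_fold / copy_fold


# Approximate H1D3 / mM2C1 fold differences as stated in the text.
REPORTED = {
    "beta-galactosidase": {"activity_fold": 4.8, "copy_fold": 10.0},
    "hygromycin phosphotransferase": {"activity_fold": 4.1, "copy_fold": 7.0},
}

if __name__ == "__main__":
    for gene, folds in REPORTED.items():
        ratio = activity_per_copy_ratio(folds["activity_fold"], folds["copy_fold"])
        print(f"{gene}: {ratio:.2f}")
```

For both genes the cell line with more copies shows the higher activity, in line with the copy-number dependence described above; because the copy numbers are stated only approximately, the ratios should be read as rough indications rather than precise estimates.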

[0437] The relative amount of β-galactosidase protein expressed in H1D3 cells can be estimated based on the V_(max) of this enzyme [500 for homogeneous, crystallized bacterial β-galactosidase (Naider et al. (1972) Biochemistry 11:3202-3210)] and the specific activity of H1D3 cell protein. A V_(max) of 500 means that the homogeneous β-galactosidase protein hydrolyzes 500 μmoles of substrate per minute per mg of enzyme protein at 37° C. One mg of total H1D3 cell protein extract can hydrolyze 1.4 μmoles of substrate per minute at 37° C., which means that 0.28% of the protein present in the H1D3 cell extract is β-galactosidase.
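The estimate in the preceding paragraph is a single division, reproduced here as a Python sketch. It assumes, as the paragraph does, that the enzyme in the crude extract has the same specific activity as the purified enzyme; the function name is hypothetical.

```python
def enzyme_fraction_percent(extract_activity, vmax_pure_enzyme):
    """Percent of total protein that is enzyme.

    extract_activity: micromoles substrate/min per mg total extract protein.
    vmax_pure_enzyme: micromoles substrate/min per mg homogeneous enzyme.
    """
    return 100.0 * extract_activity / vmax_pure_enzyme


if __name__ == "__main__":
    # 1.4 and 500 are the values quoted in the text for H1D3 extract and for
    # homogeneous bacterial beta-galactosidase, respectively.
    print(round(enzyme_fraction_percent(1.4, 500.0), 2))
```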

[0438] The hygromycin phosphotransferase is present in a membrane-bound form in H1D3 and mM2C1 cells. The tendency of the enzyme to integrate into membranes in higher eukaryotic cells may be related to its periplasmic localization in prokaryotic cells. The bacterial hygromycin phosphotransferase has not been purified to homogeneity; thus, its V_(max) has not been determined. Therefore, no estimate can be made of the total amount of hygromycin phosphotransferase protein expressed in these cell lines. The 4-fold higher specific activity of hygromycin phosphotransferase in H1D3 cells as compared to mM2C1 cells, however, indicates that its expression is also copy number dependent.

[0439] The constant and high level expression of the β-galactosidase gene in H1D3 and mM2C1 cells, particularly in the absence of any selective pressure for the expression of this gene, clearly indicates the stability of the expression of genes carried in the heterochromatic megachromosomes. This conclusion is further supported by the observation that the level of hygromycin phosphotransferase expression did not change when H1D3 and mM2C1 cells were grown under non-selective conditions. The consistent high-level, stable, and copy-number dependent expression of bacterial marker genes clearly indicates that the megachromosome is an ideal vector system for expression of foreign genes.

EXAMPLE 7

[0440] Summary of Some of the Cell Lines with SATACs and Minichromosomes That Have Been Constructed

[0441] 1. EC3/7-Derived cell lines

[0442] The LMTK⁻-derived cell line, which is a mouse fibroblast cell line, was transfected with λCM8 and λgtWESneo DNA [see, EXAMPLE 2] to produce transformed cell lines. Among these was EC3/7, deposited at the European Collection of Animal Cell Culture (ECACC) under Accession No. 90051001 [see, U.S. Pat. No. 5,288,625; see, also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S. application Ser. No. 08/375,271]. This cell line contains the dicentric chromosome with the neo-centromere. Recloning and selection produced cell lines such as EC3/7C5, which are cell lines with the stable neo-minichromosome and the formerly dicentric chromosome [see, FIG. 2C].

[0443] 2. KE1-2/4 Cells

[0444] Fusion of EC3/7 with CHO-K20 cells and selection with G418/HAT produced hybrid cell lines; among these was KE1-2/4, which has been deposited with the ECACC under Accession No. 96040924. KE1-2/4 is a stable cell line that contains the λneo-chromosome [see, FIG. 2D; see, also U.S. Pat. No. 5,288,625], produced by E-type amplifications. KE1-2/4 has been transfected with vectors containing λ DNA, selectable markers, such as the puromycin-resistance gene, and genes of interest, such as p53 and the anti-HIV ribozyme gene. These vectors target the gene of interest into the λneo-chromosome by virtue of homologous recombination with the heterologous DNA in the chromosome.

[0445] 3. C5pMCT53 Cells

[0446] The EC3/7C5 cell line has been co-transfected with pH132, pCH110 and λ DNA [see, EXAMPLE 2] as well as other constructs. Various clones and subclones have been selected. For example, transformation with a construct that includes p53-encoding DNA produced cells designated C5pMCT53.

[0447] 4. TF1004G24 Cells

[0448] As discussed above, cotransfection of EC3/7C5 cells with plasmids [pH132, pCH110 available from Pharmacia; see, also Hall et al. (1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA [λ cI857 Sam7 (New England Biolabs)] produced transformed cells. Among these is TF1004G24, which contains the DNA encoding the anti-HIV ribozyme in the neo-minichromosome. Recloning of TF1004G24 produced numerous cell lines. Among these is the NHHL24 cell line. This cell line also has the anti-HIV ribozyme in the neo-minichromosome and expresses high levels of β-gal. It has been fused with CHO-K20 cells to produce various hybrids.

[0449] 5. TF1004G19-Derived cells

[0450] Recloning and selection of the TF1004G transformants produced the cell line TF1004G19, discussed above in EXAMPLE 4, which contains the unstable sausage chromosome and the neo-minichromosome. Single cell cloning produced the TF1004G-19C5 [see FIG. 4] cell line, which has a stable sausage chromosome and the neo-minichromosome. TF1004G-19C5 has been fused with CHO cells and the hybrids grown under selective conditions to produce the 19C5xHa4 and 19C5xHa3 cell lines [see, EXAMPLE 4] and others. Recloning of the 19C5xHa3 cell line yielded a cell line containing a gigachromosome, i.e., cell line 19C5xHa47, see FIG. 2E. BrdU treatment of 19C5xHa4 cells and growth under selective conditions [neomycin (G) and/or hygromycin (H)] has produced hybrid cell lines such as the G3D5 and G4D6 cell lines and others. G3D5 has the neo-minichromosome and the megachromosome. G4D6 has only the neo-minichromosome.

[0451] Recloning of 19C5xHa4 cells in H medium produced numerous clones. Among these is H1D3 [see FIG. 4], which has the stable megachromosome. Repeated BrdU treatment and recloning of H1D3 cells has produced the HB31 cell line, which has been used for transformations with the pTEMPUD, pTEMPU, pTEMPU3, and pCEPUR-132 vectors [see, Examples 12 and 14, below].

[0452] H1D3 has been fused with a CD4⁺ HeLa cell line that carries DNA encoding CD4 and neomycin resistance on a plasmid [see, e.g., U.S. Pat. Nos. 5,413,914, 5,409,810, 5,266,600, 5,223,263, 5,215,914 and 5,144,019, which describe these HeLa cells]. Selection with GH has produced hybrids, including H1xHE41 [see FIG. 4], which carries the megachromosome and also a single human chromosome that includes the CD4neo construct. Repeated BrdU treatment and single cell cloning has produced cell lines with the megachromosome [cell line 1B3, see FIG. 4]. About 25% of the 1B3 cells have a truncated megachromosome [~90-120 Mb]. Another of these subclones, designated 2C5, was cultured on hygromycin-containing medium and megachromosome-free cell lines were obtained and grown in G418-containing medium. Recloning of these cells yielded cell lines such as IB4 and others that have a dwarf megachromosome [~150-200 Mb], and cell lines, such as 11C3 and mM2C1, which have a micro-megachromosome [~50-90 Mb]. The micro-megachromosome of cell line mM2C1 has no telomeres; however, if desired, synthetic telomeres, such as those described and generated herein, may be added to the mM2C1 cell micro-megachromosomes. Cell lines containing smaller truncated megachromosomes, such as the mM2C1 cell line containing the micro-megachromosome, can be used to generate even smaller megachromosomes, e.g., ~10-30 Mb in size. This may be accomplished, for example, by breakage and fragmentation of the micro-megachromosome in these cells through exposing the cells to X-ray irradiation, BrdU or telomere-directed in vivo chromosome fragmentation.

EXAMPLE 8

[0453] Replication of the Megachromosome

[0454] The homogeneous architecture of the megachromosomes provides a unique opportunity to perform a detailed analysis of the replication of the constitutive heterochromatin.

[0455] A. Materials and methods

[0456] 1. Culture of cell lines

[0457] H1D3 mouse-hamster hybrid cells carrying the megachromosome [see, EXAMPLE 4] were cultured in F-12 medium containing 10% fetal calf serum [FCS] and 400 μg/ml Hygromycin B [Calbiochem]. G3D5 hybrid cells [see, Example 4] were maintained in F-12 medium containing 10% FCS, 400 μg/ml Hygromycin B (Calbiochem), and 400 μg/ml G418 [SIGMA]. Mouse A9 fibroblast cells were cultured in F-12 medium supplemented with 10% FCS.

[0458] 2. BrdU labelling

[0459] In typical experiments, 20-24 parallel semi-confluent cell cultures were set up in 10 cm Petri dishes. Bromodeoxyuridine (BrdU) (Fluka) was dissolved in distilled water alkalized with a drop of NaOH, to make a 10⁻² M stock solution. Aliquots of 10-50 μl of this BrdU stock solution were added to each 10 ml culture, to give a final BrdU concentration of 10-50 μM. The cells were cultured in the presence of BrdU for 30 min, and then washed with warm complete medium, and incubated without BrdU until required. At this point, 5 μg/ml colchicine was added to a sample culture every 1 or 2 h. After 1-2 h colchicine treatment, mitotic cells were collected by “shake-off” and regular chromosome preparations were made for immunolabelling.
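The final BrdU concentrations quoted above follow from a simple dilution calculation, sketched here in Python. The sketch neglects the small volume added by the aliquot itself, which is an assumption consistent with the rounded figures in the text; the function name is hypothetical.

```python
def final_concentration_um(stock_molar, aliquot_ul, culture_ml):
    """Final concentration (micromolar) after adding an aliquot of stock.

    The aliquot volume is neglected relative to the culture volume, which is
    an approximation (10-50 microliters into 10 ml changes the volume < 1%).
    """
    stock_um = stock_molar * 1e6        # mol/l -> micromol/l
    aliquot_ml = aliquot_ul / 1000.0    # microliters -> milliliters
    return stock_um * aliquot_ml / culture_ml


if __name__ == "__main__":
    for aliquot in (10.0, 50.0):
        conc = final_concentration_um(1e-2, aliquot, 10.0)
        print(f"{aliquot:.0f} ul of 10^-2 M stock in 10 ml: {conc:.0f} uM")
```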

[0460] 3. Immunolabelling of chromosomes and in situ hybridization

[0461] Immunolabelling with fluorescein-conjugated anti-BrdU monoclonal antibody (Boehringer) was done according to the manufacturer's recommendations, except that for mouse A9 chromosomes, 2 M hydrochloric acid was used at 37° C. for 25 min, while for chromosomes of hybrid cells, 1 M hydrochloric acid was used at 37° C. for 30 min. In situ hybridization with biotin-labelled probes, and indirect immunofluorescence and in situ hybridization on the same preparation, were performed as described previously [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see, also U.S. Pat. No. 5,288,625].

[0462] 4. Microscopy

[0463] All observations and microphotography were made by using a Vanox AHBS (Olympus) microscope. Fujicolor 400 Super G or Fujicolor 1600 Super HG high-speed color negatives were used for photographs.

[0464] B. Results

[0465] The replication of the megachromosome was analyzed by BrdU pulse labelling followed by immunolabelling. The basic parameters for DNA labelling in vivo were first established. Using a 30-min pulse of 50 μM BrdU in parallel cultures, samples were taken and fixed at 5 min intervals from the beginning of the pulse, and every 15 min up to 1 h after the removal of BrdU. Incorporated BrdU was detected by immunolabelling with fluorescein-conjugated anti-BrdU monoclonal antibody. At the first time point (5 min) 38% of the nuclei were labelled, and a gradual increase in the number of labelled nuclei was observed during incubation in the presence of BrdU, culminating in 46% in the 30-min sample, at the time of the removal of BrdU. At further time points (60, 75, and 90 min) no significant changes were observed, and the fraction of labelled nuclei remained constant [44.5-46%].

[0466] These results indicate that (i) the incorporation of the BrdU is a rapid process, (ii) the 30 min pulse-time is sufficient for reliable labelling of S-phase nuclei, and (iii) the BrdU can be effectively removed from the cultures by washing.

[0467] The length of the cell cycle of the H1D3 and G3D5 cells was estimated by measuring the time between the appearance of the earliest BrdU signals on the extreme late replicating chromosome segments and the appearance of the same pattern only on one of the chromatids of the chromosomes after one completed cell cycle. The length of G2 period was determined by the time of the first detectable BrdU signal on prophase chromosomes and by the labelled mitoses method [Quastler et al. (1959) Exp. Cell Res. 17:420-438]. The length of the S-phase was determined in three ways: (i) on the basis of the length of cell cycle and the fraction of nuclei labelled during the 30-120 min pulse; (ii) by measuring the time between the very end of the replication of the extreme late replicating chromosomes and the detection of the first signal on the chromosomes at the beginning of S phase; (iii) by the labelled mitoses method. In repeated experiments, the duration of the cell cycle was found to be 22-26 h, the S phase 10-14 h, and the G2 phase 3.5-4.5 h.
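Method (i) above can be approximated with a short calculation, sketched here in Python. The sketch assumes an asynchronous population in which the fraction of labelled nuclei equals the fraction of the cell cycle occupied by S phase; this is a simplification, and the text does not state the exact formula that was used. The 46% labelled fraction is taken from the 30-min pulse experiment described earlier, and the function name is hypothetical.

```python
def s_phase_hours(cycle_hours, labelled_fraction):
    """Rough S-phase length as cell cycle length times labelled fraction.

    Assumes cells are evenly distributed over the cycle, which ignores the
    age structure of a growing population (a deliberate simplification).
    """
    if not 0.0 <= labelled_fraction <= 1.0:
        raise ValueError("labelled_fraction must be between 0 and 1")
    return cycle_hours * labelled_fraction


if __name__ == "__main__":
    labelled = 0.46  # fraction of nuclei labelled after the 30-min pulse
    for cycle in (22.0, 26.0):
        print(f"cycle {cycle:.0f} h -> S phase about "
              f"{s_phase_hours(cycle, labelled):.1f} h")
```

Under these assumptions the estimate falls within the 10-14 h range reported for the S phase.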

[0468] Analyses of the replication of the megachromosome were made inparallel cultures by collecting mitotic cells at two hour intervalsfollowing two hours of colchicine treatment. In a repeat experiment, thesame analysis was performed using one hour sample intervals and one hourcolchicine treatment. Although the two procedures gave comparableresults, the two hour sample intervals were viewed as more appropriatesince approximately 30% of the cells were found to have a considerablyshorter or longer cell cycle than the average. The characteristicreplication patterns of the individual chromosomes, especially some ofthe late replicating hamster chromosomes, served as useful internalmarkers for the different stages of S-phase. To minimize the errorcaused by the different lengths of cell cycles in the differentexperiments, samples were taken and analyzed throughout the whole cellcycle until the appearance of the first signals on one chromatid at thebeginning of the second S-phase.

[0469] The sequence of replication in the megachromosome is as follows.At the very beginning of the S-phase, the replication of themegachromosome starts at the ends of the chromosomes. The firstinitiation of replication in an interstitial position can usually bedetected at the centromeric region. Soon after, but still in the firstquarter of the S-phase, when the terminal region of the short arm hasalmost completed its replication, discrete initiation signals appearalong the chromosome arms. In the second quarter of the S-phase, asreplication proceeds, the BrdU-labelled zones gradually widen, and thecheckered pattern of the megachromosome becomes clear [see, e.g., FIG.2F]. At the same time, pericentric regions of mouse chromosomes alsoshow intense incorporation of BrdU. The replication of themegachromosome peaks at the end of the second quarter and in the thirdquarter of the S-phase. At the end of the third quarter, and at the verybeginning of the last quarter of the S-phase, the megachromosome and thepericentric heterochromatin of the mouse chromosomes complete theirreplication. By the end of S-phase, only the very late replicatingsegments of mouse and hamster chromosomes are still incorporating BrdU.

[0470] The replication of the whole genome occurs in distinct phases. The signal of incorporated BrdU increased continuously until the end of the first half of the S-phase, but at the beginning of the third quarter of the S-phase chromosome segments other than the heterochromatic regions hardly incorporated BrdU. In the last quarter of the S-phase, the BrdU signals increased again when the extreme late replicating segments showed very intense incorporation.

[0471] Similar analyses of the replication in mouse A9 cells were performed as controls. To increase the resolution of the immunolabelling pattern, pericentric regions of A9 chromosomes were decondensed by treatment with Hoechst 33258. Because of the intense replication of the surrounding euchromatic sequences, precise localization of the initial BrdU signal in the heterochromatin was normally difficult, even on undercondensed mouse chromosomes. On those chromosomes where the initiation signal(s) were localized unambiguously, the replication of the pericentric heterochromatin of A9 chromosomes was similar to that of the megachromosome. Chromosomes of A9 cells also exhibited replication patterns and sequences similar to those of the mouse chromosomes in the hybrid cells. These results indicate that the replicators of the megachromosome and mouse chromosomes retained their original timing and specificity in the hybrid cells.

[0472] By comparing the pattern of the initiation sites obtained after BrdU incorporation with the location of the integration sites of the “foreign” DNA in a detailed analysis of the first quarter of the S-phase, an attempt was made to identify origins of replication (initiation sites) in relation to the amplicon structure of the megachromosome. The double band of integrated DNA on the long arm of the megachromosome served as a cytological marker. The results showed a colocalization of the BrdU and in situ hybridization signals found at the cytological level, indicating that the “foreign” DNA sequences are in close proximity to the origins of replication, presumably integrated into the non-satellite sequences between the replicator and the satellite sequences [see, FIG. 3]. As described in Example 6.B.4, the rDNA sequences detected in the megachromosome are also localized at the amplicon borders at the site of integration of the “foreign” DNA sequences, suggesting that the origins of replication responsible for initiation of replication of the megachromosome involve rDNA sequences. In the pericentric region of several other chromosomes, dot-like BrdU signals can also be observed that are comparable to the initiation signals on the megachromosome. These signals may represent similar initiation sites in the heterochromatic regions of normal chromosomes.

[0473] At a frequency of 10⁻⁴, “uncontrolled” amplification of the integrated DNA sequences was observed in the megachromosome. Consistent with the assumption (above) that “foreign” sequences are in proximity of the replicators, this spatially restricted amplification is likely to be a consequence of uncontrolled repeated firings of the replication origin(s) without completing the replication of the whole segment.

[0474] C. Discussion

[0475] It has generally been thought that the constitutive heterochromatin of the pericentric regions of chromosomes is late replicating [see, e.g., Miller (1976) Chromosoma 55:165-170]. On the contrary, these experiments provide evidence that the replication of the heterochromatic blocks starts at a discrete initiation site in the first half of the S-phase and continues through approximately three-quarters of S-phase. This difference can be explained in the following ways: (i) in normal chromosomes, actively replicating euchromatic sequences that surround the satellite DNA mask the initiation signals, so that the initiation sites cannot be localized precisely; (ii) replication of the heterochromatin can only be detected unambiguously in a period during the second half of the S-phase, when the bulk of the heterochromatin replicates and most other chromosomal regions have already completed their replication, or have not yet started it. Thus, low resolution cytological techniques, such as analysis of incorporation of radioactively labelled precursors by autoradiography, only detect prominent replication signals in the heterochromatin in the second half of S-phase, when adjacent euchromatic segments are no longer replicating.

[0476] In the megachromosome, the primary initiation sites of replication colocalize with the sites where the “foreign” DNA sequences and rDNA sequences are integrated at the amplicon borders. Similar initiation signals were observed at the same time in the pericentric heterochromatin of some of the mouse chromosomes that do not have “foreign” DNA, indicating that the replication initiation sites at the borders of amplicons may reside in the non-satellite flanking sequences of the satellite DNA blocks. The presence of a primary initiation site at each satellite DNA doublet implies that this large chromosome segment is a single huge unit of replication [megareplicon] delimited by the primary initiation site and the termination point at each end of the unit. Several lines of evidence indicate that, within this higher-order replication unit, “secondary” origins and replicons contribute to the complete replication of the megareplicon:

[0477] 1. The total replication time of the heterochromatic regions of the megachromosome was ~9-11 h. At the rate of movement of replication forks, 0.5-5 kb per minute, that is typical of eukaryotic chromosomes [Kornberg et al. (1992) DNA Replication. 2nd ed., New York: W.H. Freeman and Co, p. 474], replication of a ~15 Mb replicon would require 50-500 h. Alternatively, if only a single replication origin was used, the average replication speed would have to be 25 kb per minute to complete replication within 10 h. By comparing the intensity of the BrdU signals on the euchromatic and the heterochromatic chromosome segments, no evidence for a 5- to 50-fold difference in their replication speed was found.

[0478] 2. Using short BrdU pulse labelling, a single origin of replication would produce a replication band that moves along the replicon, reflecting the movement of the replication fork. In contrast, a widening of the replication zone that finally gave rise to the checkered pattern of the megachromosome was observed, and within the replication period, the most intensive BrdU incorporation occurred in the second half of the S-phase. This suggests that once the megareplicator has been activated, it permits the activation and firing of “secondary” origins, and that the replication of the bulk of the satellite DNA takes place from these “secondary” origins during the second half of the S-phase. This is supported by the observation that in certain stages of the replication of the megachromosome, the whole amplicon can apparently be labelled by a short BrdU pulse.
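The fork-rate arithmetic in item 1 above can be checked with a short calculation. The following is a minimal sketch only; the function names are illustrative and not part of the original disclosure, and the calculation assumes, as the text does, that the whole ~15 Mb replicon is copied at a constant rate from a single origin.

```python
# Illustrative check of the replication-time estimates in item 1.
# Assumption: constant fork rate and a single origin, as in the text.

REPLICON_KB = 15_000  # ~15 Mb replicon, expressed in kb


def replication_time_hours(replicon_kb, fork_rate_kb_per_min):
    """Hours needed to copy replicon_kb at the given fork rate."""
    return replicon_kb / fork_rate_kb_per_min / 60.0


def required_rate_kb_per_min(replicon_kb, hours):
    """Average rate needed to copy replicon_kb within the given hours."""
    return replicon_kb / (hours * 60.0)


if __name__ == "__main__":
    slow = replication_time_hours(REPLICON_KB, 0.5)   # slowest typical fork
    fast = replication_time_hours(REPLICON_KB, 5.0)   # fastest typical fork
    needed = required_rate_kb_per_min(REPLICON_KB, 10)
    print(f"time at 0.5 kb/min: {slow:.0f} h")
    print(f"time at 5 kb/min:   {fast:.0f} h")
    print(f"rate needed for 10 h: {needed:.0f} kb/min")
    print(f"fold over typical rates: {needed / 5.0:.0f}- to {needed / 0.5:.0f}-fold")
```

The 5- to 50-fold figure quoted in item 1 is the ratio of the required 25 kb per minute to the typical range of 0.5-5 kb per minute.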

[0479] Megareplicators and secondary replication origins seem to be under strict temporal and spatial control. The first initiation within the megachromosomes usually occurred at the centromere, and shortly afterward all the megareplicators become active. The last segment of the megachromosome to complete replication was usually the second segment of the long arm. Results of control experiments with mouse A9 chromosomes indicate that replication of the heterochromatin of mouse chromosomes corresponds to the replication of the megachromosome amplicons. Therefore, the pre-existing temporal control of replication in the heterochromatic blocks is preserved in the megachromosome. Positive [Hassan et al. (1994) J. Cell. Sci. 107:425-434] and negative [Haase et al. (1994) Mol. Cell. Biol. 14:2516-2524] correlations between transcriptional activity and initiation of replication have been proposed. In the megachromosome, transcription of the integrated genes seems to have no effect on the original timing of the replication origins. The concerted, precise timing of the megareplicator initiations in the different amplicons suggests the presence of specific, cis-acting sequences, origins of replication.

[0480] Considering that pericentric heterochromatin of mouse chromosomes contains thousands of short, simple repeats spanning 7-15 Mb, and the centromere itself may also contain hundreds of kilobases, the existence of a higher-order unit of replication seems probable. The observed uncontrolled intrachromosomal amplification restricted to a replication initiation region of the megachromosome is highly suggestive of a rolling-circle type amplification, and provides additional evidence for the presence of a replication origin in this region.

[0481] The finding that a specific replication initiation site occurs at the boundaries of amplicons suggests that replication might play a role in the amplification process. These results suggest that each amplicon of the megachromosome can be regarded as a huge megareplicon defined by a primary initiation site [megareplicator] containing “secondary” origins of replication. Fusion of replication bubbles from different origins of bi-directional replication [DePamphilis (1993) Ann. Rev. Biochem. 62:29-63] within the megareplicon could form a giant replication bubble, which would correspond to the whole megareplicon. In the light of this, the formation of megabase-size amplicons can be accommodated by a replication-directed amplification mechanism. In H- and E-type amplifications, intrachromosomal multiplication of the amplicons was observed [see, above EXAMPLES], which is consistent with the unequal sister chromatid exchange model. Induced or spontaneous unscheduled replication of a megareplicon in the constitutive heterochromatin may also form new amplicon(s) leading to the expansion of the amplification or to the heterochromatic polymorphism of “normal” chromosomes. The “restoration” of the missing segment on the long arm of the megachromosome may well be the result of the re-replication of one amplicon limited to one strand.

[0482] Taken together, without being bound by any theory, a replication-directed mechanism is a plausible explanation for the initiation of large-scale amplifications in the centromeric regions of mouse chromosomes, as well as for the de novo chromosome formations. If specific [amplificator, i.e., sequences controlling amplification] sequences play a role in promoting the amplification process, sequences at the primary replication initiation site [megareplicator] of the megareplicon are possible candidates.

[0483] The presence of rRNA gene sequence at the amplicon borders near the foreign DNA in the megachromosome suggests that this sequence contributes to the primary replication initiation site and participates in large-scale amplification of the pericentric heterochromatin in de novo formation of SATACs. Ribosomal RNA genes have an intrinsic amplification mechanism that provides for multiple copies of tandem genes. Thus, for purposes herein, in the construction of SATACs in cells, rDNA will serve as a region for targeted integration, and as components of SATACs constructed in vitro.

EXAMPLE 9

[0484] Generation of Chromosomes with Amplified Regions Derived from Mouse Chromosome 1

[0485] To show that the events described in EXAMPLES 2-7 are not unique to mouse chromosome 7 and to show that the EC3/7 cell line is not required for formation of the artificial chromosomes, the experiments have been repeated using different initial cell lines and DNA fragments. Any cell or cell line should be amenable to use, or it can readily be determined that it is not.

[0486] A. Materials

[0487] The LP11 cell line was produced by the “scrape-loading” transfection method [Fechheimer et al. (1987) Proc. Natl. Acad. Sci. U.S.A. 84:8463-8467] using 25 μg plasmid DNA for 5×10⁶ recipient cells. LP11 cells were maintained in F-12 medium containing 3-15 μg/ml puromycin [SIGMA].

[0488] B. Amplification in LP11 cells

[0489] The large-scale amplification described in the above Examples is not restricted to the transformed EC3/7 cell line or to the chromosome 7 of mouse. In an independent transformation experiment, LMTK- cells were transfected using the calcium phosphate precipitation procedure with a selectable puromycin-resistance gene-containing construct designated pPuroTel [see Example 1.E.2. for a description of this plasmid], to establish cell line LP11. Cell line LP11 carries chromosome(s) with amplified chromosome segments of different lengths [~150-600 Mb]. Cytological analysis of the LP11 cells indicated that the amplification occurred in the pericentric region of the long arm of a submetacentric chromosome formed by Robertsonian translocation. This chromosome arm was identified by G-banding as chromosome 1. C-banding and in situ hybridization with mouse major satellite DNA probe showed that an E-type amplification had occurred: the newly formed region was composed of an array of euchromatic chromosome segments containing different amounts of heterochromatin. The size and C-band pattern of the amplified segments were heterogeneous. In several cells, the number of these amplified units exceeded 50; single-cell subclones of LP11 cell lines, however, carry stable marker chromosomes with 10-15 segments and constant C-band patterns.

[0490] Sublines of the thymidine kinase-deficient LP11 cells (e.g., LP11-15P1C5/7 cell line) established by single-cell cloning of LP11 cells were transfected with a thymidine kinase gene construct. Stable TK⁺ transfectants were established.

EXAMPLE 10

[0491] Isolation of SATACs and Other Chromosomes with Atypical Base Content and/or Size

[0492] I. Isolation of artificial chromosomes from endogenous chromosomes

[0493] Artificial chromosomes, such as SATACs, may be sorted from endogenous chromosomes using any suitable procedures, which typically involve isolating metaphase chromosomes, distinguishing the artificial chromosomes from the endogenous chromosomes, and separating the artificial chromosomes from endogenous chromosomes. Such procedures will generally include the following basic steps: (1) culture of a sufficient number of cells (typically about 2×10⁷ mitotic cells) to yield, preferably, on the order of 1×10⁶ artificial chromosomes, (2) arrest of the cell cycle of the cells in a stage of mitosis, preferably metaphase, using a mitotic arrest agent such as colchicine, (3) treatment of the cells, particularly by swelling of the cells in hypotonic buffer, to increase susceptibility of the cells to disruption, (4) application of physical force to disrupt the cells in the presence of isolation buffers for stabilization of the released chromosomes, (5) dispersal of chromosomes in the presence of isolation buffers for stabilization of free chromosomes, (6) separation of artificial from endogenous chromosomes and (7) storage (and shipping if desired) of the isolated artificial chromosomes in appropriate buffers. Modifications and variations of the general procedure for isolation of artificial chromosomes, for example to accommodate different cell types with differing growth characteristics and requirements and to optimize the duration of mitotic block with arresting agents to obtain the desired balance of chromosome yield and level of debris, may be empirically determined.
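The relationship in step (1) between the number of mitotic cells cultured and the number of artificial chromosomes obtained can be sketched as below. This is an illustration under stated assumptions, not a description of measured yields: the copy number per cell and the overall recovery fraction are hypothetical parameters introduced here, and the function names are illustrative.

```python
# Illustrative yield arithmetic for step (1) of the general procedure.
# Assumptions (not from the text): each mitotic cell carries
# `copies_per_cell` artificial chromosomes, and a fixed fraction
# `recovery` of them survives isolation and sorting.


def implied_recovery(chromosomes_out, mitotic_cells, copies_per_cell=1):
    """Fraction of artificial chromosomes recovered from the input cells."""
    return chromosomes_out / (mitotic_cells * copies_per_cell)


def mitotic_cells_needed(target_chromosomes, recovery, copies_per_cell=1):
    """Mitotic cells required to reach a target chromosome count."""
    return target_chromosomes / (recovery * copies_per_cell)


if __name__ == "__main__":
    # Figures from the text: about 2e7 mitotic cells for about 1e6 chromosomes.
    r = implied_recovery(1e6, 2e7)
    print(f"implied overall recovery at one copy per cell: {r:.0%}")
    # Hypothetical scale-up to 1e7 chromosomes at the same recovery.
    print(f"cells needed for 1e7 chromosomes: {mitotic_cells_needed(1e7, r):.1e}")
```

Under the one-copy assumption, the figures in step (1) correspond to an overall recovery of roughly 5%; a production cell line carrying more copies per cell would need proportionally fewer cells.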

[0494] Steps 1-5 relate to isolation of metaphase chromosomes. The separation of artificial from endogenous chromosomes (step 6) may be accomplished in a variety of ways. For example, the chromosomes may be stained with DNA-specific dyes such as Hoechst 33258 and chromomycin A₃ and sorted into artificial and endogenous chromosomes on the basis of dye content by employing fluorescence-activated cell sorting (FACS). To facilitate larger scale isolation of the artificial chromosomes, different separation techniques may be employed such as swinging bucket centrifugation (to effect separation based on chromosome size and density) [see, e.g., Mendelsohn et al. (1968) J. Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on the basis of chromosome size and density) [see, e.g., Burki et al. (1973) Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys. Res. Commun. 83:1404-1414], and velocity sedimentation (to effect separation on the basis of chromosome size and shape) [see, e.g., Collard et al. (1984) Cytometry 5:9-19]. Immuno-affinity purification may also be employed in larger scale artificial chromosome isolation procedures. In this process, large populations of artificial chromosome-containing cells (asynchronous or mitotically enriched) are harvested en masse and the mitotic chromosomes (which can be released from the cells using standard procedures such as by incubation of the cells in hypotonic buffer and/or detergent treatment of the cells in conjunction with physical disruption of the treated cells) are enriched by binding to antibodies that are bound to solid state matrices (e.g., column resins or magnetic beads). Antibodies suitable for use in this procedure bind to condensed centromeric proteins or condensed and DNA-bound histone proteins. For example, autoantibody LU851 (see Hadlaczky et al. (1989) Chromosoma 97:282-288), which recognizes mammalian centromeres, may be used for large-scale isolation of chromosomes prior to subsequent separation of artificial from endogenous chromosomes using methods such as FACS. The bound chromosomes would be washed and eventually eluted for sorting. Immunoaffinity purification may also be used directly to separate artificial chromosomes from endogenous chromosomes. For example, SATACs may be generated in or transferred to (e.g., by microinjection or microcell fusion as described herein) a cell line that has chromosomes that contain relatively small amounts of heterochromatin, such as hamster cells (e.g., V79 cells or CHO-K1 cells). The SATACs, which are predominantly heterochromatin, are then separated from the endogenous chromosomes by utilizing anti-heterochromatin binding protein (Drosophila HP-1) antibody conjugated to a solid matrix. Such matrix preferentially binds SATACs relative to hamster chromosomes. Unbound hamster chromosomes are washed away from the matrix and the SATACs are eluted by standard techniques.

[0495] A. Cell lines and cell culturing procedures

[0496] In one isolation procedure, 1B3 mouse-hamster-human hybrid cells [see, FIG. 4] carrying the megachromosome or the truncated megachromosome were grown in F-12 medium supplemented with 10% fetal calf serum, 150 μg/ml hygromycin B and 400 μg/ml G418. GHB42 [a cell line recloned from G3D5 cells] mouse-hamster hybrid cells carrying the megachromosome and the minichromosome were also cultured in F-12 medium containing 10% fetal calf serum, 150 μg/ml hygromycin B and 400 μg/ml G418. The doubling time of both cell lines was about 24-40 hours, typically about 32 hours.

[0497] Typically, cell monolayers are passaged when they reach about 60-80% confluence and are split every 48-72 hours. Cells that reach greater than 80% confluence senesce in culture and are not preferred for chromosome harvesting. Cells may be plated in 100-200 100-mm dishes at about 50-70% confluency 12-30 hours before mitotic arrest (see, below).

[0498] Other cell lines that may be used as hosts for artificial chromosomes and from which the artificial chromosomes may be isolated include, but are not limited to, PtK1 (NBL-3) marsupial kidney cells (ATCC accession no. CCL35), CHO-K1 Chinese hamster ovary cells (ATCC accession no. CCL61), V79-4 Chinese hamster lung cells (ATCC accession no. CCL93), Indian muntjac skin cells (ATCC accession no. CCL157), LMTK(−) thymidine kinase deficient murine L cells (ATCC accession no. CCL1.3), Sf9 fall armyworm (Spodoptera frugiperda) ovary cells (ATCC accession no. CRL 1711) and any generated heterokaryon (hybrid) cell lines, such as, for example, the hamster-murine hybrid cells described herein, that may be used to construct MACs, particularly SATACs.

[0499] Cell lines may be selected, for example, to enhance efficiency of artificial chromosome production and isolation as may be desired in large-scale production processes. For instance, one consideration in selecting host cells may be the artificial chromosome-to-total chromosome ratio of the cells. To facilitate separation of artificial chromosomes from endogenous chromosomes, a higher artificial chromosome-to-total chromosome ratio might be desirable. For example, for H1D3 cells (a murine/hamster heterokaryon; see FIG. 4), this ratio is 1:50, i.e., one artificial chromosome (the megachromosome) to 50 total chromosomes. In contrast, Indian muntjac skin cells (ATCC accession no. CCL157) contain a smaller total number of chromosomes (a diploid number of chromosomes of 7), as do kangaroo rat cells (a diploid number of chromosomes of 12), which would provide for a higher artificial chromosome-to-total chromosome ratio upon introduction of, or generation of, artificial chromosomes in the cells.
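The effect of host chromosome number on the artificial chromosome-to-total chromosome ratio can be illustrated with a short comparison. This is a sketch only; it assumes a single copy of the artificial chromosome per cell and that the host's endogenous chromosome number is unchanged by its introduction, neither of which is stated in the text for the muntjac or kangaroo rat cells.

```python
from fractions import Fraction

# Illustrative comparison of artificial-to-total chromosome ratios.
# Assumption (not from the text): one artificial chromosome per cell is
# added to the host's endogenous complement.


def ratio_after_introduction(endogenous, artificial=1):
    """Artificial-to-total ratio once `artificial` copies are present."""
    return Fraction(artificial, endogenous + artificial)


if __name__ == "__main__":
    # H1D3: the text gives 1 artificial chromosome in 50 total.
    print("H1D3 (from text):", Fraction(1, 50))
    # Diploid numbers quoted in the text for the alternative hosts.
    for name, diploid in [("Indian muntjac", 7), ("kangaroo rat", 12)]:
        print(name, ratio_after_introduction(diploid))
```

On these assumptions, a single artificial chromosome would make up a several-fold larger share of the metaphase chromosomes in either low-chromosome-number host than in H1D3 cells.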

[0500] Another consideration in selecting host cells for production and isolation of artificial chromosomes may be size of the endogenous chromosomes as compared to that of the artificial chromosomes. Size differences of the chromosomes may be exploited to facilitate separation of artificial chromosomes from endogenous chromosomes. For example, because Indian muntjac skin cell chromosomes are considerably larger than minichromosomes and truncated megachromosomes, separation of the artificial chromosome from the muntjac chromosomes may possibly be accomplished using univariate (one dye, either Hoechst 33258 or Chromomycin A3) FACS separation procedures.

[0501] Another consideration in selecting host cells for production and isolation of artificial chromosomes may be the doubling time of the cells. For example, the amount of time required to generate a sufficient number of artificial chromosome-containing cells for use in procedures to isolate artificial chromosomes may be of significance for large-scale production. Thus, host cells with shorter doubling times may be desirable. For instance, the doubling time of V79 hamster lung cells is about 9-10 hours in comparison to the approximately 32-hour doubling time of H1D3 cells.
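The practical consequence of doubling time can be estimated with the usual exponential growth relation, time = doubling time × log₂(target cells / starting cells). The sketch below assumes uninterrupted exponential growth with no lag or plateau; the starting and target cell numbers are hypothetical values chosen only for illustration, and 9.5 hours is taken as the midpoint of the 9-10 hour range given for V79 cells.

```python
import math

# Illustrative expansion-time estimate under ideal exponential growth.
# Assumption (not from the text): no lag phase, no density limitation.


def expansion_time_hours(start_cells, target_cells, doubling_time_h):
    """Hours of exponential growth needed to go from start to target."""
    return doubling_time_h * math.log2(target_cells / start_cells)


if __name__ == "__main__":
    start, target = 1e5, 1e8  # hypothetical cell numbers
    for name, td in [("V79", 9.5), ("H1D3", 32.0)]:
        hours = expansion_time_hours(start, target, td)
        print(f"{name}: about {hours:.0f} h ({hours / 24:.1f} days)")
```

Because the number of doublings is the same for both lines, the expansion time scales directly with doubling time, so the slower line takes roughly 32/9.5, or a little over three, times as long.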

[0502] Accordingly, several considerations may go into the selection of host cells for the production and isolation of artificial chromosomes. It may be that the host cell selected as the most desirable for de novo formation of artificial chromosomes is not optimized for large-scale production of the artificial chromosomes generated in the cell line. In such cases, it may be possible, once the artificial chromosome has been generated in the initial host cell line, to transfer it to a production cell line better suited to efficient, high-level production and isolation of the artificial chromosome. Such transfer may be accomplished through several methods, for example through microcell fusion, as described herein, or microinjection into the production cell line of artificial chromosomes purified from the generating cell line using procedures such as described herein. Production cell lines preferably contain two or more copies of the artificial chromosome per cell.

[0503] B. Chromosome isolation

[0504] In general, cells are typically cultured for two generations at exponential growth prior to mitotic arrest. To accumulate mitotic 1B3 and GHB42 cells in one particular isolation procedure, 5 μg/ml colchicine was added for 12 hours to the cultures. The mitotic index obtained was 60-80%. The mitotic cells were harvested by selective detachment by gentle pipetting of the medium on the monolayer cells. It is also possible to utilize mechanical shake-off as a means of releasing the rounded-up (mitotic) cells from the plate. The cells were sedimented by centrifugation at 200×g for 10 minutes.

[0505] Cells (grown on plastic or in suspension) may be arrested in different stages of the cell cycle with chemical agents other than colchicine, such as hydroxyurea, vinblastine, colcemid or aphidicolin. Chemical agents that arrest the cells in stages other than mitosis, such as hydroxyurea and aphidicolin, are used to synchronize the cycles of all cells in the population and then are removed from the cell medium to allow the cells to proceed, more or less simultaneously, to mitosis, at which time they may be harvested to disperse the chromosomes. Mitotic cells could be enriched by mechanical shake-off (adherent cells). The cell cycles of cells within a population of MAC-containing cells may also be synchronized by nutrient, growth factor or hormone deprivation, which leads to an accumulation of cells in the G₁ or G₀ stage; readdition of nutrients or growth factors then allows the quiescent cells to re-enter the cell cycle in synchrony for about one generation. Cell lines that are known to respond to hormone deprivation in this manner, and which are suitable as hosts for artificial chromosomes, include the Nb2 rat lymphoma cell line, which is absolutely dependent on prolactin for stimulation of proliferation (see Gout et al. (1980) Cancer Res. 40:2433-2436). Culturing the cells in prolactin-deficient medium for 18-24 hours leads to arrest of proliferation, with cells accumulating early in the G₁ phase of the cell cycle. Upon addition of prolactin, all the cells progress through the cell cycle until M phase, at which point greater than 90% of the cells would be in mitosis (addition of colchicine could increase the proportion of mitotic cells to greater than 95%). The time between reestablishing proliferation by prolactin addition and harvesting mitotic cells for chromosome separation may be empirically determined.

[0506] Alternatively, adherent cells, such as V79 cells, may be grown in roller bottles and mitotic cells released from the plastic surface by rotating the roller bottles at 200 rpm or greater (Shwarchuk et al. (1993) Int. J. Radiat. Biol. 64:601-612). At any given time, approximately 1% of the cells in an exponentially growing asynchronous population are in M-phase. Even without the addition of colchicine, 2×10⁷ mitotic cells have been harvested from four 1750-cm² roller bottles after a 5-min spin at 200 rpm. Addition of colchicine for 2 hours may increase the yield to 6×10⁸ mitotic cells.

[0507] Several procedures may be used to isolate metaphase chromosomes from these cells, including, but not limited to, one based on a polyamine buffer system [Cram et al. (1990) Methods in Cell Biology 33:377-382], one on a modified hexylene glycol buffer system [Hadlaczky et al. (1982) Chromosoma 86:643-651], one on a magnesium sulfate buffer system [Van den Engh et al. (1988) Cytometry 9:266-270 and Van den Engh et al. (1984) Cytometry 5:108], one on an acetic acid fixation buffer system [Stoehr et al. (1982) Histochemistry 74:57-61], and one on a technique utilizing hypotonic KCl and propidium iodide [Cram et al. (1994) XVII meeting of the International Society for Analytical Cytology, October 16-21, Tutorial IV Chromosome Analysis and Sorting with Commercial Flow Cytometers; Cram et al. (1990) Methods in Cell Biology 33:376].

[0508] 1. Polyamine procedure

[0509] In the polyamine procedure that was used in isolating artificial chromosomes from either 1B3 or GHB42 cells, about 10⁷ mitotic cells were incubated in 10 ml hypotonic buffer (75 mM KCl, 0.2 mM spermine, 0.5 mM spermidine) for 10 minutes at room temperature to swell the cells. The cells are swollen in hypotonic buffer to loosen the metaphase chromosomes but not to the point of cell lysis. The cells were then centrifuged at 100×g for 8 minutes, typically at room temperature. The cell pellet was drained carefully and about 10⁷ cells were resuspended in 1 ml polyamine buffer [15 mM Tris-HCl, 20 mM NaCl, 80 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 14 mM β-mercaptoethanol, 0.1% digitonin, 0.2 mM spermine, 0.5 mM spermidine] for physical dispersal of the metaphase chromosomes. Chromosomes were then released by gently drawing the cell suspension up and expelling it through a 22 G needle attached to a 3 ml plastic syringe. The chromosome concentration was about 1-3×10⁸ chromosomes/ml.
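When preparing the polyamine buffer from concentrated stocks, the volume of each stock follows from the dilution relation C₁V₁ = C₂V₂. The sketch below is illustrative only: the final concentrations are those listed above, but the stock concentrations and the 100 ml batch size are hypothetical values not given in the text, and components specified by percentage (digitonin) or commonly added neat (β-mercaptoethanol) are left out.

```python
# Illustrative dilution arithmetic (C1 * V1 = C2 * V2) for the polyamine
# buffer. Final concentrations are from the text; the stock
# concentrations below are HYPOTHETICAL and must be replaced with the
# stocks actually on hand.

FINAL_MM = {
    "Tris-HCl": 15, "NaCl": 20, "KCl": 80, "EDTA": 2,
    "EGTA": 0.5, "spermine": 0.2, "spermidine": 0.5,
}

STOCK_MM = {  # assumed stock solutions, in mM
    "Tris-HCl": 1000, "NaCl": 5000, "KCl": 1000, "EDTA": 500,
    "EGTA": 100, "spermine": 100, "spermidine": 100,
}


def stock_volume_ml(final_mm, stock_mm, final_volume_ml):
    """Volume of stock needed so the final mix has concentration final_mm."""
    return final_mm * final_volume_ml / stock_mm


def recipe(final_volume_ml):
    """Stock volumes (ml) for every listed component of one batch."""
    return {
        name: stock_volume_ml(conc, STOCK_MM[name], final_volume_ml)
        for name, conc in FINAL_MM.items()
    }


if __name__ == "__main__":
    for name, vol in recipe(100).items():
        print(f"{name:<11s} {vol:7.3f} ml")
```

The remaining volume is made up with water after all stocks, the detergent and the reducing agent have been added.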

[0510] The polyamine buffer isolation protocol is well suited for obtaining high molecular weight chromosomal DNA [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78; VanDilla et al. (1986) Biotechnology 4:537-552; Bartholdi et al. (1988) In “Molecular Genetics of Mammalian Cells” (M. Gottesman, ed.), Methods in Enzymology 151:252-267. Academic Press, Orlando]. The chromosome stabilizing buffer uses the polyamines spermine and spermidine to stabilize chromosome structure [Blumenthal et al. (1979) J. Cell Biol. 81:255-259; Lalande et al. (1985) Cancer Genet. Cytogenet. 23:151-157] and heavy metal chelators to reduce nuclease activity.

[0511] The polyamine buffer protocol has wide applicability; however, as with other protocols, the following variables must be optimized for each cell type: blocking time, cell concentration, type of hypotonic swelling buffer, swelling time, volume of hypotonic buffer, and vortexing time. Chromosomes prepared using this protocol are typically highly condensed.

[0512] There are several hypotonic buffers that may be used to swell the cells, for example buffers such as the following: 75 mM KCl; 75 mM KCl, 0.2 mM spermine, 0.5 mM spermidine; Ohnuki's buffer of 16.2 mM sodium nitrate, 6.5 mM sodium acetate, 32.4 mM KCl [Ohnuki (1965) Nature 208:916-917 and Ohnuki (1968) Chromosoma 25:402-428]; and a variation of Ohnuki's buffer that additionally contains 0.2 mM spermine and 0.5 mM spermidine. The amount and hypotonicity of added buffer vary depending on cell type and cell concentration. Amounts may range from 2.5-5.5 ml per 10⁷ cells or more. Swelling times may vary from 10-90 minutes depending on cell type and which swelling buffer is used.

[0513] The composition of the polyamine isolation buffer may also be varied. For example, one modified buffer contains 15 mM Tris-HCl, pH 7.2, 70 mM NaCl, 80 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 14 mM β-mercaptoethanol, 0.25% Triton-X, 0.2 mM spermine and 0.5 mM spermidine.

[0514] Chromosomal dispersal may also be accomplished by a variety of physical means. For example, cell suspension may be gently drawn up and expelled in a 3-ml syringe fitted with a 22-gauge needle [Cram et al. (1990) Methods in Cell Biology 33:377-382], cell suspension may be agitated on a bench-top vortex [Cram et al. (1990) Methods in Cell Biology 33:377-382], cell suspension may be disrupted with a homogenizer [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78; Carrano et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384] and cell suspension may be disrupted with a bench-top ultrasonic bath [Stoehr et al. (1982) Histochemistry 74:57-61].

[0515] 2. Hexylene glycol buffer system

[0516] In the hexylene glycol buffer procedure that was used in isolating artificial chromosomes from either 1B3 or GHB42 cells, about 8×10⁶ mitotic cells were resuspended in 10 ml glycine-hexylene glycol buffer [100 mM glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with saturated Ca-hydroxide solution] and incubated for 10 minutes at 37° C., followed by centrifugation for 10 minutes to pellet the nuclei. The supernatant was centrifuged again at 200×g for 20 minutes to pellet the chromosomes. Chromosomes were resuspended in isolation buffer (1-3×10⁸ chromosomes/ml).

[0517] The hexylene glycol buffer composition may also be modified. For example, one modified buffer contains 25 mM Tris-HCl, pH 7.2, 750 mM hexylene glycol, 0.5 mM CaCl₂, 1.0 mM MgCl₂ [Carrano et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384].

[0518] 3. Magnesium-sulfate buffer system

[0519] This buffer system may be used with any of the methods of cell swelling and chromosomal dispersal, such as described above in connection with the polyamine and hexylene glycol buffer systems. In this procedure, mitotic cells are resuspended in the following buffer: 4.8 mM HEPES, pH 8.0, 9.8 mM MgSO₄, 48 mM KCl, 2.9 mM dithiothreitol [Van den Engh et al. (1985) Cytometry 6:92 and Van den Engh et al. (1984) Cytometry 5:108].

[0520] 4. Acetic acid fixation buffer system

[0521] This buffer system may be used with any of the methods of cell swelling and chromosomal dispersal, such as described above in connection with the polyamine and hexylene glycol buffer systems. In this procedure, mitotic cells are resuspended in the following buffer: 25 mM Tris-HCl, pH 3.2, 750 mM (1,6)-hexanediol, 0.5 mM CaCl₂, 1.0% acetic acid [Stoehr et al. (1982) Histochemistry 74:57-61].

[0522] 5. KCl-propidium iodide buffer system

[0523] This buffer system may be used with any of the methods of cellswelling and chromosomal dispersal, such as described above inconnection with the polyamine and hexylene glycol buffer systems. Inthis procedure, mitotic cells are resuspended in the following buffer:25 mM KCl, 50 μg/ml propidium iodide, 0.33% Triton X-100, 333 μg/mlRNase [Cram et al. (1990) Methods in Cell Biology 33:376].

[0524] The fluorescent dye propidium iodide is used and also serves as a chromosome stabilizing agent. Swelling of the cells in the hypotonic medium (which may also contain propidium iodide) may be monitored by placing a small drop of the suspension on a microscope slide and observing the cells by phase/fluorescent microscopy. The cells should exclude the propidium iodide while swelling, but some may lyse prematurely and show chromosome fluorescence. After the cells have been centrifuged and resuspended in the KCl-propidium iodide buffer system, they will be lysed due to the presence of the detergent in the buffer. The chromosomes may then be dispersed and then incubated at 37° C. for up to 30 minutes to permit the RNase to act. The chromosome preparation is then analyzed by flow cytometry. The propidium iodide fluorescence can be excited at the 488 nm wavelength of an argon laser and detected through an OG 570 optical filter by a single photomultiplier tube. The single pulse may be integrated and acquired in a univariate histogram. The flow cytometer may be aligned to a CV of 2% or less using small (1.5 μm diameter) microspheres. The chromosome preparation is filtered through 60 μm nylon mesh before analysis.
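The alignment criterion above (a CV of 2% or less on calibration microspheres) can be checked with a short calculation. The following is a minimal Python sketch, assuming that CV means the population standard deviation divided by the mean, expressed in percent; the function names and the bead readings are hypothetical and serve only as illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(intensities):
    """Return the CV in percent: 100 * population standard deviation / mean."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return 100.0 * pstdev(intensities) / m


def is_aligned(intensities, max_cv_percent=2.0):
    """True if the bead peak meets the 'CV of 2% or less' criterion."""
    return coefficient_of_variation(intensities) <= max_cv_percent


# Hypothetical integrated pulse values (arbitrary units) for one bead peak.
bead_peak = [1000, 1010, 990, 1005, 995]

if __name__ == "__main__":
    print(f"CV = {coefficient_of_variation(bead_peak):.2f}%")
    print("aligned" if is_aligned(bead_peak) else "realign the instrument")
```

In practice the CV would be computed on the gated microsphere peak of the univariate histogram rather than on a handful of raw values.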

[0525] C. Staining of chromosomes with DNA-specific dyes

[0526] Subsequent to isolation, the chromosome preparation was stained with Hoechst 33258 at 6 μg/ml and chromomycin A3 at 200 μg/ml. Fifteen minutes prior to analysis, 25 mM Na-sulphite and 10 mM Na-citrate were added to the chromosome suspension.

[0527] D. Flow sorting of chromosomes

[0528] Chromosomes obtained from 1B3 and GHB42 cells and maintained were suspended in a polyamine-based sheath buffer (0.5 mM EGTA, 2.0 mM EDTA, 80 mM KCl, 70 mM NaCl, 15 mM Tris-HCl, pH 7.2, 0.2 mM spermine and 0.5 mM spermidine) [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78]. The chromosomes were then passed through a dual-laser cell sorter [FACStar Plus or FACStar Vantage, Becton Dickinson Immunocytometry System; other dual-laser sorters may also be used, such as those manufactured by Coulter Electronics (Elite ESP) and Cytomation (MoFlo)] in which two lasers were set to excite the dyes separately, allowing a bivariate analysis of the chromosomes by size and base-pair composition. Because of the difference between the base composition of the SATACs and the other chromosomes and the resulting difference in interaction with the dyes, as well as size differences, the SATACs were separated from the other chromosomes.

[0529] E. Storage of the sorted artificial chromosomes

[0530] Sorted chromosomes may be pelleted by centrifugation and resuspended in a variety of buffers, and stored at 4° C. For example, the isolated artificial chromosomes may be stored in GH buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with saturated Ca-hydroxide solution) [see, e.g., Hadlaczky et al. (1982) Chromosoma 86:643-659] for one day and embedded by centrifugation into agarose. The sorted chromosomes were centrifuged into an agarose bed and the plugs are stored in 500 mM EDTA at 4° C. Additional storage buffers include CMB-I/polyamine buffer (17.5 mM Tris-HCl, pH 7.4, 1.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.40 mM spermine, 1.0 mM spermidine, 0.25 mM EGTA, 40 mM KCl, 35 mM NaCl) and CMB-II/polyamine buffer (100 mM glycine, pH 7.5, 78 mM hexylene glycol, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.40 mM spermine, 1.0 mM spermidine, 0.25 mM EGTA, 40 mM KCl, 35 mM NaCl).

[0531] When microinjection is the intended use, the sorted chromosomes are stored in 30% glycerol at −20° C. Sorted chromosomes may also be stored without glycerol for short periods of time (3-6 days) in storage buffers at 4° C. Exemplary buffers for microinjection include CBM-I (10 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.30 mM spermine, 0.75 mM spermidine) and CBM-II (100 mM glycine, pH 7.5, 78 mM hexylene glycol, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.30 mM spermine, 0.75 mM spermidine).

[0532] For long-term storage of sorted chromosomes, the above buffers are preferably supplemented with 50% glycerol and stored at −20° C.

[0533] F. Quality control

[0534] 1. Analysis of the purity

[0535] The purity of the sorted chromosomes was checked by fluorescence in situ hybridization (FISH) with a biotin-labeled mouse satellite DNA probe [see, Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110]. Purity of the isolated chromosomes was about 97-99%.

[0536] 2. Characteristics of the sorted chromosomes

[0537] Pulsed field gel electrophoresis and Southern hybridization were carried out to determine the size distribution of the DNA content of the sorted artificial chromosomes.

[0538] G. Functioning of the purified artificial chromosomes

[0539] To check whether their activity is preserved, the purified artificial chromosomes may be microinjected (using methods such as those described in Example 13) into primary cells, somatic cells and stem cells, which are then analyzed for expression of the heterologous genes carried by the artificial chromosomes, e.g., by analysis of growth on selective medium and assays of β-galactosidase activity.

[0540] II. Sorting of mammalian artificial chromosome-containing microcells

[0541] A. Micronucleation

[0542] Cells were grown to 80-90% confluency in 4 T150 flasks. Colcemid was added to a final concentration of 0.06 μg/ml, and then incubated with the cells at 37° C. for 24 hours.

[0543] B. Enucleation

[0544] Ten μg/ml cytochalasin B was added and the resulting microcells were centrifuged at 15,000 rpm for 70 minutes at 28-33° C.

[0545] C. Purification of microcells by filtration

[0546] The microcells were purified using Swinnex filter units and Nucleopore filters [5 μm and 3 μm].

[0547] D. Staining and sorting microcells

[0548] As above, the cells were stained with Hoechst and chromomycin A3 dyes. The microcells were sorted by cell sorter to isolate the microcells that contain the mammalian artificial chromosomes.

[0549] E. Fusion

[0550] The microcells that contain the artificial chromosome are fused, for example, as described in Example 1.A.5., to selected primary cells, somatic cells or embryonic stem cells to generate transgenic (non-human) animals and for gene therapy purposes, and to other cells to deliver the chromosomes to the cells.

EXAMPLE 11

[0551] Introduction of Mammalian Artificial Chromosomes into Insect Cells

[0552] Insect cells are useful hosts for MACs, particularly for use in the production of gene products, for a number of reasons, including:

[0553] 1. A mammalian artificial chromosome provides an extragenomic specific integration site for introduction of genes encoding proteins of interest [reduced chance of mutation in production system].

[0554] 2. The large size of an artificial chromosome permits megabase size DNA integration so that genes encoding an entire pathway leading to a protein or nonprotein of therapeutic value, such as an alkaloid [digitalis, morphine, taxol], can be accommodated by the artificial chromosome.

[0555] 3. Amplification of genes encoding useful proteins can be accomplished in the artificial mammalian chromosome to obtain higher protein yields in insect cells.

[0556] 4. Insect cells support required post-translational modifications (glycosylation, phosphorylation) essential for protein biological function.

[0557] 5. Insect cells do not support mammalian viruses, which eliminates cross-contamination of product with human infectious agents.

[0558] 6. The ability to introduce chromosomes circumvents traditional recombinant baculovirus systems for production of nutritional, industrial or medicinal proteins in insect cell systems.

[0559] 7. The low temperature optimum for insect cell growth (28° C.) permits reduced energy cost of production.

[0560] 8. Serum free growth medium for insect cells will result in lower production costs.

[0561] 9. Artificial chromosome-containing cells can be stored indefinitely at low temperature.

[0562] 10. Insect larvae will serve as biological factories for the production of nutritional, medicinal or industrial proteins by microinjection of fertilized insect eggs.

[0563] A. Demonstration that insect cells recognize mammalian promoters

[0564] Gene constructs containing a mammalian promoter, such as the CMV promoter, linked to a detectable marker gene [Renilla luciferase gene (see, e.g., U.S. Pat. No. 5,292,658 for a description of DNA encoding the Renilla luciferase, and plasmid pTZrLuc-1, which can provide the starting material for construction of such vectors; see also SEQ ID No. 10)] and also including the simian virus 40 (SV40) promoter operably linked to the β-galactosidase gene were introduced into the cells of two species, Trichoplusia ni [cabbage looper] and Bombyx mori [silk worm].

[0565] After transferring the constructs into the insect cell lines either by electroporation or by microinjection, expression of the marker genes was detected in luciferase assays (see e.g., Example 12.C.3) and in β-galactosidase assays (such as lacZ staining assays) after a 24-h incubation. In each case a positive result was obtained in the samples containing the genes which was absent in samples in which the genes were omitted. In addition, a B. mori β-actin promoter-Renilla luciferase gene fusion was introduced into the T. ni and B. mori cells which yielded light emission after transfection. Thus, certain mammalian promoters function to direct expression of these marker genes in insect cells. Therefore, MACs are candidates for expression of heterologous genes in insect cells.

[0566] B. Construction of vectors for use in insect cells and fusion with mammalian cells

[0567] 1. Transform LMTK⁻ cells with an expression vector containing:

[0568] a. B. mori β-actin promoter-Hyg^(r) selectable marker gene for insect cells, and

[0569] b. SV40 or CMV promoters controlling a puromycin^(r) selectable marker gene for mammalian cells.

[0570] 2. Detect expression of the mammalian promoter in LMTK⁻ cells (puromycin^(r) LMTK⁻ cells).

[0571] 3. Use puromycin^(r) cells in fusion experiments with Bombyx and Trichoplusia cells, select Hyg^(r) cells.

[0572] C. Insertion of the MACs into insect cells

[0573] These experiments are designed to detect expression of a detectable marker gene [such as the β-galactosidase gene expressed under the control of a mammalian promoter, such as pSV40] located on a MAC that has been introduced into an insect cell. Data indicate that β-gal was expressed.

[0574] Insect cells are fused with mammalian cells containing mammalian artificial chromosomes, e.g., the minichromosome [EC3/7C5] or the mini and the megachromosome [such as GHB42, which is a cell line recloned from G3D5] or a cell line that carries only the megachromosome [such as H1D3 or a reclone therefrom]. Fusion is carried out as follows:

[0575] 1. mammalian + insect cells (50/50%) in log phase growth are mixed;

[0576] 2. calcium/PEG cell fusion: (10 min-0.5 h);

[0577] 3. heterokaryons (+72 h) are selected.

[0578] The following selection conditions can be used to select for insect cells that contain a MAC [+ = positive selection; − = negative selection]:

[0579] 1. growth at 28° C. (+ insect cells, − mammalian cells);

[0580] 2. Grace's insect cell medium [SIGMA] (− mammalian cells);

[0581] 3. no exogenous CO₂ (− mammalian cells); and/or

[0582] 4. antibiotic selection (Hyg or G418) (+ transformed insect cells).

[0583] Immediately following the fusion protocol, many heterokaryons [fusion events] are observed between the mammalian and each species of insect cells [up to 90% heterokaryons]. After growth [2+ weeks] on insect medium containing G418 and/or hygromycin at selection levels used for selection of transformed mammalian cells, individual colonies are detected growing on the fusion plates. By virtue of selection for the antibiotic resistance conferred by the MAC and selection for insect cells, these colonies should contain MACs.

[0584] The B. mori β-actin gene promoter has been shown to direct expression of the β-galactosidase gene in B. mori cells and mammalian cells (e.g., EC3/7C5 cells). The B. mori β-actin gene promoter is, thus, particularly useful for inclusion in MACs generated in mammalian cells that will subsequently be transferred into insect cells because the presence of any marker gene linked to the promoter can be determined in the mammalian and resulting insect cell lines.

EXAMPLE 12

[0585] Preparation of Chromosome Fragmentation Vectors and Other Vectors for Targeted Integration of DNA into MACs

[0586] Fragmentation of the megachromosome should ultimately result in smaller stable chromosomes that contain about 15 Mb to 50 Mb that will be easily manipulated for use as vectors. Vectors to effect such fragmentation should also aid in determination and identification of the elements required for preparation of an in vitro-produced artificial chromosome.

[0587] Reduction in the size of the megachromosome can be achieved in a number of different ways including: stress treatment, such as by starvation, or cold or heat treatment; treatment with agents that destabilize the genome or nick DNA, such as BrdU, coumarin, EMS and others; treatment with ionizing radiation [see, e.g., Brown (1992) Curr. Opin. Genet. Dev. 2:479-486]; and telomere-directed in vivo chromosome fragmentation [see, e.g., Farr et al. (1995) EMBO J. 14:5444-5454].

[0588] A. Preparation of vectors for fragmentation of the artificial chromosome and also for targeted integration of selected gene products

[0589] 1. Construction of pTEMPUD

[0590] Plasmid pTEMPUD [see FIG. 5] is a mouse homologous recombination “killer” vector for in vivo chromosome fragmentation, and also for inducing large-scale amplification via site-specific integration. With reference to FIG. 5, the ˜3,625-bp SalI-PstI fragment was derived from the pBabe-puro retroviral vector [see, Morgenstern et al. (1990) Nucleic Acids Res. 18:3587-3596]. This fragment contains DNA encoding ampicillin resistance, the pUC origin of replication, and the puromycin N-acetyl transferase gene under control of the SV40 early promoter. The URA3 gene portion comes from the pYAC5 cloning vector [SIGMA]. URA3 was cut out of pYAC5 with SalI-XhoI digestion, cloned into pNEB193 [New England Biolabs], which was then cut with EcoRI-SalI and ligated to the SalI site of pBabepuro to produce pPU.

[0591] A 1293-bp fragment [see SEQ ID No. 1] encoding the mouse major satellite was isolated as an EcoRI fragment from a DNA library produced from mouse LMTK⁻ fibroblast cells and inserted into the EcoRI site of pPU to produce pMPU.

[0592] The TK promoter-driven diphtheria toxin gene [DT-A] was derived from pMC1 DT-A [see, Maxwell et al. (1986) Cancer Res. 46:4660-4666] by BglII-XhoI digestion and cloned into the pMC1neo poly A expression vector [STRATAGENE, La Jolla, Calif.] by replacing the neomycin-resistance gene coding sequence. The TK promoter, DT-A gene and poly A sequence were removed from this vector, cohesive ends were filled with Klenow and the resulting fragment was blunt-end ligated into the SnaBI site [TACGTA] of pMPU to produce pMPUD.

[0593] The Hutel 2.5-kb fragment [see SEQ ID No. 3] was inserted at the PstI site [see the 6100 PstI-3625 PstI fragment on pTEMPUD] of pMPUD to produce pTEMPUD. This fragment includes a human telomere. It includes a unique BglII site [see nucleotides 1042-1047 of SEQ ID No. 3], which will be used as a site for introduction of a synthetic telomere that includes multiple repeats [80] of TTAGGG with BamHI and BglII ends. After insertion the BglII site will remain unique, since the BamHI overhang is compatible with the BglII overhang but ligation of a BamHI end to a BglII end destroys the BglII site at that junction, so that only a single BglII site will remain. Selection for the unique BglII site ensures that the synthetic telomere will be inserted in the correct orientation. The unique BglII site is the site at which the vector is linearized.
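The reasoning about compatible ends can be illustrated with a minimal Python sketch. It assumes the standard recognition sequences G^GATCC for BamHI and A^GATCT for BglII (both cut after the first base, leaving a GATC overhang) and models only the top strand; the function names are hypothetical and are not part of the described procedure.

```python
BAMHI = "GGATCC"  # recognition site, cut as G^GATCC
BGLII = "AGATCT"  # recognition site, cut as A^GATCT


def overhang(site):
    """Four-base 5' overhang left by an enzyme that cuts after position 1."""
    return site[1:5]


def junction(left_site, right_site):
    """Top-strand sequence formed when the end left of the cut in
    `left_site` is ligated to the end right of the cut in `right_site`."""
    return left_site[:1] + right_site[1:]


def cut_by(seq):
    """Names of the two enzymes whose site matches the six-base junction."""
    return [name for name, site in (("BamHI", BAMHI), ("BglII", BGLII)) if seq == site]


if __name__ == "__main__":
    # Both enzymes leave the same GATC overhang, so the ends can anneal.
    print(overhang(BAMHI), overhang(BGLII))
    # BglII-cut vector end joined to the BamHI end of the insert: hybrid site.
    hybrid = junction(BGLII, BAMHI)
    print(hybrid, cut_by(hybrid))
    # BglII end of the insert joined to the BglII-cut vector: site restored.
    restored = junction(BGLII, BGLII)
    print(restored, cut_by(restored))
```

Under these assumptions one junction is a hybrid sequence that neither enzyme recognizes and the other regenerates a BglII site, which is why a single BglII site remains after insertion.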

[0594] To generate a synthetic telomere made up of multiple repeats of the sequence TTAGGG, attempts were made to clone or amplify ligation products of 30-mer oligonucleotides containing repeats of the sequence. Two 30-mer oligonucleotides, one containing four repeats of TTAGGG bounded on each end of the complete run of repeats by half of a repeat and the other containing five repeats of the complement AATCCC, were annealed. The resulting double-stranded molecule with 3-bp protruding ends, each representing half of a repeat, was expected to ligate with itself to yield concatamers of n×30 bp. However, this approach was unsuccessful, likely due to formation of quadruplex DNA from the G-rich strand. Similar difficulty has been encountered in attempts to generate long repeats of the pentameric human satellite II and III units. Thus, it appears that, in general, any oligomer sequence containing periodically spaced consecutive series of guanine nucleotides is likely to undergo undesired quadruplex formation that hinders construction of long double-stranded DNAs containing the sequence.

[0595] Therefore, in another attempt to construct a synthetic telomere for insertion into the BglII site of pTEMPUD, the starting material was based on the complementary C-rich repeat sequence (i.e., AATCCC), which would not be susceptible to quadruplex structure formation. Two plasmids, designated pTel280110 and pTel280111, were constructed as follows to serve as the starting materials.

[0596] First, a long oligonucleotide containing 9 repeats of the sequence AATCCC (i.e., the complement of telomere sequence TTAGGG) in reverse order bounded on each end of the complete run of repeats by half of a repeat (therefore, in essence, containing 10 repeats), and recognition sites for PstI and PacI restriction enzymes, was synthesized using standard methods. The oligonucleotide sequence is as follows: (SEQ ID NO. 29) 5′-AAACTGCAGGTTAATTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCGGGAT-3′

[0597] A partially complementary short oligonucleotide of sequence

[0598] 3′-TTGGGCCCTAGGCTTAAGG-5′ (SEQ ID NO. 30)

[0599] was also synthesized. The oligonucleotides were gel-purified, annealed, repaired with Klenow polymerase and digested with EcoRI and PstI. The resulting EcoRI/PstI fragment was ligated with EcoRI/PstI-digested pUC19. The resulting plasmid was used to transform E. coli DH5α competent cells and plasmid DNA (pTel102) from one of the transformants surviving selection on LB/ampicillin was digested with PacI, rendered blunt-ended by Klenow and dNTPs and digested with HindIII. The resulting 2.7-kb fragment was gel-purified.

[0600] Simultaneously, the same plasmid was amplified by the polymerase chain reaction using extended and more distal 26-mer M13 sequencing primers. The amplification product was digested with SmaI and HindIII, the double-stranded 84-bp fragment containing the 60-bp telomeric repeat (plus 24 bp of linker sequence) was isolated on a 6% native polyacrylamide gel, and ligated with the double-digested pTel102 to yield a 120-bp telomeric sequence. This plasmid was used to transform DH5α cells. Plasmid DNA from two of the resulting recombinants that survived selection on ampicillin (100 μg/ml) was sequenced on an ABI DNA sequencer using the dye-termination method. One of the plasmids, designated pTel29, contained a sequence of 20 repeats of the sequence TTAGGG (i.e., 19 successive repeats of TTAGGG bounded on each end of the complete run of repeats with half of a repeat). The other plasmid, designated pTel28, had undergone a deletion of 2 bp (TA) at the junction where the two sequences, each containing, in essence, 10 repeats of the TTAGGG sequence, had been ligated to yield the plasmid. This resulted in a GGGTGGG motif at the junction in pTel28. This mutation provides a useful tag in telomere-directed chromosome fragmentation experiments. Therefore, the pTel29 insert was amplified by PCR using pUC/M13 sequencing primers based on sequence somewhat longer and farther from the polylinker than usual as follows:

[0601] 5′-GCCAGGGTTTTCCCAGTCACGACGT-3′ (SEQ ID NO. 31)

[0602] or in some experiments

[0603] 5′-GCTGCAAGGCGATTAAGTTGGGTAAC-3′ (SEQ ID NO. 32)

[0604] as the M13 forward primer, and

[0605] 5′-TATGTTGTGTGGAATTGTGAGCGGAT-3′ (SEQ ID NO. 33)

[0606] as the M13 reverse primer.

[0607] The amplification product was digested with SmaI and HindIII. The resulting 144-bp fragment was gel-purified on a 6% native polyacrylamide gel and ligated with pTel28 that had been digested with PacI, blunt-ended with Klenow and dNTP and then digested with HindIII to remove linker. The ligation yielded a plasmid designated pTel2801 containing a telomeric sequence of 40 repeats of the sequence TTAGGG in which one of the repeats (i.e., the 30th repeat) lacked two nucleotides (TA), due to the deletion that had occurred in pTel28, to yield a repeat as follows: TGGG.

[0608] In the next extension step, pTel2801 was digested with SmaI and HindIII and the 264-bp insert fragment was gel-purified and ligated with pTel2801 which had been digested with PacI, blunt-ended and digested with HindIII. The resulting plasmid was transformed into DH5α cells and plasmid DNA from 12 of the resulting transformants that survived selection on ampicillin was examined by restriction enzyme analysis for the presence of a 0.5-kb EcoRI/PstI insert fragment. Eleven of the recombinants contained the expected 0.5-kb insert. The inserts of two of the recombinants were sequenced and found to be as expected. These plasmids were designated pTel280110 and pTel280111. These plasmids, which are identical, both contain 80 repeats of the sequence TTAGGG, in which two of the repeats (i.e., the 30th and 70th repeats) lacked two nucleotides (TA), due to the deletion that had occurred in pTel28, to yield a repeat as follows: TGGG. Thus, in each of the cloning steps (except the first), the length of the synthetic telomere doubled; that is, it was increasing in size exponentially. Its length was 60×2^(n) bp, wherein n is the number of extension cloning steps undertaken. Therefore, in principle (assuming E. coli, or any other microbial host, e.g., yeast, tolerates long tandem repetitive DNA), it is possible to assemble any desirable size of safe telomeric repeats.
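The doubling relationship stated above, a length of 60×2^(n) bp after n extension steps, can be tabulated with a few lines of Python. This is a sketch of the nominal arithmetic only: it ignores the 2-bp (TA) deletions carried over from pTel28, and the function names are hypothetical.

```python
REPEAT_UNIT = "TTAGGG"
START_BP = 60  # the first cloned block: in essence 10 repeats of 6 bp


def telomere_length_bp(n_steps):
    """Nominal synthetic telomere length after n extension cloning steps."""
    if n_steps < 0:
        raise ValueError("number of extension steps cannot be negative")
    return START_BP * 2 ** n_steps


def repeat_count(n_steps):
    """Nominal number of TTAGGG repeats after n extension cloning steps."""
    return telomere_length_bp(n_steps) // len(REPEAT_UNIT)


if __name__ == "__main__":
    for n in range(5):
        print(f"n={n}: {telomere_length_bp(n)} bp, {repeat_count(n)} repeats")
```

With these definitions n=1 gives the 120 bp (20 repeats) reported for pTel29, n=2 the 40 repeats of pTel2801, and n=3 the 80 repeats of pTel280110 and pTel280111.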

[0609] In a further extension step, pTel280110 was digested with PacI, blunt-ended with Klenow polymerase in the presence of dNTP, then digested with HindIII. The resulting 0.5-kb fragment was gel-purified. Plasmid pTel280111 was cleaved with SmaI and HindIII and the 3.2-kb fragment was gel-purified and ligated to the 0.5-kb fragment from pTel280110. The resulting plasmid was used to transform DH5α cells. Plasmid DNA was purified from transformants surviving ampicillin selection. Nine of the selected recombinants were examined by restriction enzyme analysis for the presence of a 1.0-kb EcoRI/PstI fragment. Four of the recombinants (designated pTlk2, pTlk6, pTlk7 and pTlk8) were thus found to contain the desired 960-bp telomere DNA insert sequence that included 160 repeats of the sequence TTAGGG in which four of the repeats lacked two nucleotides (TA), due to the deletion that had occurred in pTel28, to yield a repeat as follows: TGGG. Partial DNA sequence analysis of the EcoRI/PstI fragment of two of these plasmids (i.e., pTlk2 and pTlk6), in which approximately 300 bp from both ends of the fragment were elucidated, confirmed that the sequence was composed of successive repeats of the TTAGGG sequence.

[0610] In order to add PmeI and BglII sites to the synthetic telomere sequence, pTlk2 was digested with PacI and PstI and the 3.7-kb fragment (i.e., 2.7-kb pUC19 and 1.0-kb repeat sequence) was gel-purified and ligated at the PstI cohesive end with the following oligonucleotide: 5′-GGGTTTAAACAGATCTCTGCA-3′ (SEQ ID NO. 34). The ligation product was subsequently repaired with Klenow polymerase and dNTP, ligated to itself and transformed into E. coli strain DH5α. A total of 14 recombinants surviving selection on ampicillin were obtained. Plasmid DNA from each recombinant could be cleaved with BglII, indicating that this added unique restriction site had been retained by each recombinant. Four of the 14 recombinants contained the complete 1-kb synthetic telomere insert, whereas the insert of the remaining 10 recombinants had undergone deletions of various lengths. The four plasmids in which the 1-kb synthetic telomere sequence remained intact were designated pTlkV2, pTlkV5, pTlkV8 and pTlkV12. Each of these plasmids could also be digested with PmeI; in addition, the presence of both the BglII and PmeI sites was verified by sequence analysis. Any of these four plasmids can be digested with BamHI and BglII to release a fragment containing the 1-kb synthetic telomere sequence, which is then ligated with BglII-digested pTEMPUD.
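As a quick check of the linker design, the positions of the PmeI and BglII sites in the oligonucleotide can be located with a simple string search. The Python sketch below assumes the standard recognition sequences GTTTAAAC (PmeI) and AGATCT (BglII); the helper name is hypothetical.

```python
# SEQ ID NO. 34 as given in the text, written 5' to 3'.
LINKER = "GGGTTTAAACAGATCTCTGCA"

# Standard recognition sequences (assumed, not taken from the text).
SITES = {"PmeI": "GTTTAAAC", "BglII": "AGATCT"}


def find_sites(seq, sites):
    """Map each enzyme name to the 0-based start of its site, or -1 if absent."""
    return {name: seq.find(motif) for name, motif in sites.items()}


if __name__ == "__main__":
    for name, pos in find_sites(LINKER, SITES).items():
        print(f"{name}: starts at position {pos}")
    # The oligonucleotide ends in TGCA, the four-base 3' overhang left by PstI.
    print("ends in TGCA:", LINKER.endswith("TGCA"))
```

The same search could be applied to any candidate linker before synthesis to confirm that each added site occurs exactly where intended.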

[0611] 2. Use of pTEMPUD for in vivo chromosome fragmentation

[0612] Linearization of pTEMPUD by BglII results in a linear molecule with a human telomere at one end. Integration of this linear fragment into the chromosome, such as the megachromosome in hybrid cells or any mouse chromosome which contains repeats of the mouse major satellite sequence, results in integration of the selectable marker puromycin-resistance gene and cleavage of the plasmid by virtue of the telomeric end. The DT gene prevents that entire linear fragment from integrating by random events, since upon integration and expression it is toxic. Thus random integration will be toxic, so site-directed integration into the targeted DNA will be selected. Such integration will produce fragmented chromosomes.

[0613] The fragmented truncated chromosome with the new telomere will survive, and the other fragment without the centromere will be lost. Repeated in vivo fragmentations will ultimately result in selection of the smallest functioning artificial chromosome possible. Thus, this vector can be used to produce minichromosomes from mouse chromosomes, or to fragment the megachromosome. In principle, this vector can be used to target any selected DNA sequence in any chromosome to achieve fragmentation.

[0614] 3. Construction of pTERPUD

[0615] A fragmentation/targeting vector analogous to pTEMPUD for in vivo chromosome fragmentation, and also for inducing large-scale amplification via site-specific integration, but which is based on mouse rDNA sequence instead of mouse major satellite DNA, has been designated pTERPUD. In this vector, the mouse major satellite DNA sequence of pTEMPUD has been replaced with a 4770-bp BamHI fragment of megachromosome clone 161 which contains sequence corresponding to nucleotides 10,232-15,000 in SEQ ID NO. 16.

[0616] 4. pHASPUD and pTEMPhu3

[0617] Vectors that specifically target human chromosomes can be constructed from pTEMPUD. These vectors can be used to fragment specific human chromosomes, depending upon the selected satellite sequence, to produce human minichromosomes, and also to isolate human centromeres.

[0618] a. pHASPUD

[0619] To render pTEMPUD suitable for fragmenting human chromosomes, the mouse major satellite sequence is replaced with human satellite sequences. Unlike mouse chromosomes, each human chromosome has a unique satellite sequence. For example, the mouse major satellite has been replaced with a human hexameric α-satellite [or alphoid satellite] DNA sequence. This sequence is an 813-bp fragment [nucleotides 232-1044 of SEQ ID No. 2] from clone pS12, deposited in the EMBL database under Accession number X60716, isolated from a human colon carcinoma cell line Colo320 [deposited under Accession No. ATCC CCL 220.1]. The 813-bp alphoid fragment can be obtained from the pS12 clone by nucleic acid amplification using synthetic primers, each of which contains an EcoRI site, as follows: forward primer GGGGAATTCAT TGGGATGTTT GAGTTGA [SEQ ID No. 4] and reverse primer CGAAAGTCCCC CCTAGGAGAT CTTAAGGA [SEQ ID No. 5].
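The two statements in this paragraph that lend themselves to a quick check are the fragment length and the presence of an EcoRI site in each primer. The Python sketch below assumes the standard EcoRI recognition sequence GAATTC and searches each printed primer string in both reading directions, because the orientation in which SEQ ID No. 5 is printed is not stated here; treating it as possibly written 3′ to 5′ is an assumption. The function names are hypothetical.

```python
ECORI = "GAATTC"  # standard EcoRI recognition sequence (assumed)

# Primer strings as printed in the text, with the spacing removed.
FORWARD = "GGGGAATTCATTGGGATGTTTGAGTTGA"   # SEQ ID No. 4
REVERSE = "CGAAAGTCCCCCCTAGGAGATCTTAAGGA"  # SEQ ID No. 5


def has_site(primer, site=ECORI):
    """True if the site occurs when the printed string is read
    left to right or right to left."""
    return site in primer or site in primer[::-1]


def fragment_length(first_nt, last_nt):
    """Length of a fragment given inclusive 1-based nucleotide coordinates."""
    return last_nt - first_nt + 1


if __name__ == "__main__":
    print("forward primer has EcoRI site:", has_site(FORWARD))
    print("reverse primer has EcoRI site:", has_site(REVERSE))
    print("fragment length:", fragment_length(232, 1044), "bp")
```

The coordinate check reproduces the 813 bp quoted for nucleotides 232-1044 of SEQ ID No. 2.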

[0620] Digestion of the amplified product with EcoRI results in a fragment with EcoRI ends that includes the human α-satellite sequence. This sequence is inserted into pTEMPUD in place of the EcoRI fragment that contains the mouse major satellite to yield pHASPUD.

[0621] Vector pHASPUD was linearized with BglII and used to transform EJ30 (human fibroblast) cells by scrape loading. Twenty-seven puromycin-resistant transformant strains were obtained.

[0622] b. pTEMPhu3

[0623] In pTEMPhu3, the mouse major satellite sequence is replaced by the 3-kb human chromosome 3-specific α-satellite from D3Z1 [deposited under ATCC Accession No. 85434; see, also Yrokov (1989) Cytogenet. Cell Genet. 51:1114].

[0624] 5. Use of pTEMPhu3 to induce amplification on human chromosome #3

[0625] Each human chromosome contains unique chromosome-specific alphoid sequence. Thus, pTEMPhu3, which is targeted to the chromosome 3-specific α-satellite, can be introduced into human cells under selective conditions, whereby large-scale amplification of the chromosome 3 centromeric region and production of a de novo chromosome ensues. Such induced large-scale amplification provides a means for inducing de novo chromosome formation and also for in vivo cloning of defined human chromosome fragments up to megabase size.

[0626] For example, the break-point in human chromosome 3 is on the short arm near the centromere. This region is involved in renal cell carcinoma formation. By targeting pTEMPhu3 to this region, the induced large-scale amplification may contain this region, which can then be cloned using the bacterial and yeast markers in the pTEMPhu3 vector.

[0627] The pTEMPhu3 cloning vector allows not only selection for homologous recombinants, but also direct cloning of the integration site in YACs. This vector can also be used to target human chromosome 3, preferably with a deleted short arm, in a mouse-human monochromosomal microcell hybrid line. Homologous recombinants can be screened by nucleic acid amplification (PCR), and amplification can be screened by DNA hybridization, Southern hybridization, and in situ hybridization. The amplified region can be cloned into a YAC. This vector and these methods also permit a functional analysis of cloned chromosome regions by reintroducing the cloned amplified region into mammalian cells.

[0628] B. Preparation of libraries in YAC vectors for cloning of centromeres and identification of functional chromosomal units

[0629] Another method that may be used to obtain smaller-sized functional mammalian artificial chromosome units and to clone centromeric DNA involves screening of mammalian DNA YAC vector-based libraries and functional analysis of potential positive clones in a transgenic mouse model system. A mammalian DNA library is prepared in a YAC vector, such as YRT2 [see Schedl et al. (1993) Nuc. Acids Res. 21:4783-4787], which contains the murine tyrosinase gene. The library is screened for hybridization to mammalian telomere and centromere sequence probes. Positive clones are isolated and microinjected into pronuclei of fertilized oocytes of NMRI/Han mice following standard techniques. The embryos are then transferred into NMRI/Han foster mothers. Expression of the tyrosinase gene in transgenic offspring confers an identifiable phenotype (pigmentation). The clones that give rise to tyrosinase-expressing transgenic mice are thus confirmed as containing functional mammalian artificial chromosome units.

[0630] Alternatively, fragments of SATACs may be introduced into the YAC vectors and then introduced into pronuclei of fertilized oocytes of NMRI/Han mice following standard techniques as above. The clones that give rise to tyrosinase-expressing transgenic mice are thus confirmed as containing functional mammalian artificial chromosome units, particularly centromeres.

[0631] C. Incorporation of Heterologous Genes into Mammalian Artificial Chromosomes through the Use of Homology Targeting Vectors

[0632] As described above, the use of mammalian artificial chromosomes for expression of heterologous genes obviates certain negative effects that may result from random integration of heterologous plasmid DNA into the recipient cell genome. An essential feature of the mammalian artificial chromosome that makes it a useful tool in avoiding the negative effects of random integration is its presence as an extra-genomic gene source in recipient cells. Accordingly, methods of specific, targeted incorporation of heterologous genes exclusively into the mammalian artificial chromosome, without extraneous random integration into the genome of recipient cells, are desired for heterologous gene expression from a mammalian artificial chromosome.

[0633] One means of achieving site-specific integration of heterologous genes into artificial chromosomes is through the use of homology targeting vectors. The heterologous gene of interest is subcloned into a targeting vector which contains nucleic acid sequences that are homologous to nucleotides present in the artificial chromosome. The vector is then introduced into cells containing the artificial chromosome for specific site-directed integration into the artificial chromosome through a recombination event at sites of homology between the vector and the chromosome. The homology targeting vectors may also contain selectable markers for ease of identifying cells that have incorporated the vector into the artificial chromosome as well as lethal selection genes that are expressed only upon extraneous integration of the vector into the recipient cell genome. Two exemplary homology targeting vectors, λCF-7 and λCF-7-DTA, are described below.

[0634] 1. Construction of Vector λCF-7

[0635] Vector λCF-7 contains the cystic fibrosis transmembrane conductance regulator [CFTR] gene as an exemplary therapeutic molecule-encoding nucleic acid that may be incorporated into mammalian artificial chromosomes for use in gene therapy applications. This vector, which also contains the puromycin-resistance gene as a selectable marker, as well as the Saccharomyces cerevisiae ura3 gene [orotidine-5-phosphate decarboxylase], was constructed in a series of steps as follows.

[0636] a. Construction of pURA

[0637] Plasmid pURA was prepared by ligating a 2.6-kb SalI/XhoI fragment from the yeast artificial chromosome vector pYAC5 [Sigma; see also Burke et al. (1987) Science 236:806-812 for a description of YAC vectors as well as GenBank Accession no. U01086 for the complete sequence of pYAC5] containing the S. cerevisiae ura3 gene with a 3.3-kb SalI/SmaI fragment of pHyg [see, e.g., U.S. Pat. Nos. 4,997,764, 4,686,186 and 5,162,215, and the description above]. Prior to ligation the XhoI end was treated with Klenow polymerase for blunt end ligation to the SmaI end of the 3.3 kb fragment of pHyg. Thus, pURA contains the S. cerevisiae ura3 gene, the E. coli ColE1 origin of replication and the ampicillin-resistance gene. The ura3 gene is included to provide a means to recover the integrated construct from a mammalian cell as a YAC clone.

[0638] b. Construction of pUP2

[0639] Plasmid pURA was digested with SalI and ligated to a 1.5-kb SalI fragment of pCEPUR. Plasmid pCEPUR is produced by ligating the 1.1 kb SnaBI-NhaI fragment of pBabe-puro [Morgenstern et al. (1990) Nucl. Acids Res. 18:3587-3596; provided by Dr. L. Székely (Microbiology and Tumorbiology Center, Karolinska Institutet, Stockholm); see, also Tonghua et al. (1995) Chin. Med. J. (Beijing, Engl. Ed.) 108:653-659; Couto et al. (1994) Infect. Immun. 62:2375-2378; Dunckley et al. (1992) FEBS Lett. 296:128-34; French et al. (1995) Anal. Biochem. 228:354-355; Liu et al. (1995) Blood 85:1095-1103; International PCT application Nos. WO 9520044; WO 9500178, and WO 9419456] to the NheI-NruI fragment of pCEP4 [Invitrogen].

[0640] The resulting plasmid, pUP2, contains all the elements of pURA plus the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal from pCEPUR.

[0641] c. Construction of pUP-CFTR

[0642] The intermediate plasmid pUP-CFTR was generated in order to combine the elements of pUP2 into a plasmid along with the CFTR gene. First, a 4.5-kb SalI fragment of pCMV-CFTR that contains only the CFTR-encoding DNA [see, also, Riordan et al. (1989) Science 245:1066-1073, U.S. Pat. No. 5,240,846, and Genbank Accession no. M28668 for the sequence of the CFTR gene] was ligated to XhoI-digested pCEP4 [Invitrogen and also described herein] in order to insert the CFTR gene in the multiple cloning site of the Epstein Barr virus-based (EBV) vector pCEP4 [Invitrogen, San Diego, Calif.; see also Yates et al. (1985) Nature 313:812-815; see, also U.S. Pat. No. 5,468,615] between the CMV promoter and SV40 polyadenylation signal. The resulting plasmid was designated pCEP-CFTR. Plasmid pCEP-CFTR was then digested with SalI and the 5.8-kb fragment containing the CFTR gene flanked by the CMV promoter and SV40 polyadenylation signal was ligated to SalI-digested pUP2 to generate pUP-CFTR. Thus, pUP-CFTR contains all elements of pUP2 plus the CFTR gene linked to the CMV promoter and SV40 polyadenylation signal.

[0643] d. Construction of λCF-7

[0644] Plasmid pUP-CFTR was then linearized by partial digestion with EcoRI and the 13 kb fragment containing the CFTR gene was ligated with EcoRI-digested Charon 4Aλ [see Blattner et al. (1977) Science 196:161; Williams and Blattner (1979) J. Virol. 29:555 and Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Second Ed., Cold Spring Harbor Laboratory Press, Volume 1, Section 2.18, for descriptions of Charon 4Aλ]. The resulting vector, λCF8, contains the Charon 4Aλ bacteriophage left arm, the CFTR gene linked to the CMV promoter and SV40 polyadenylation signal, the ura3 gene, the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal, the thymidine kinase promoter [TK], the ColE1 origin of replication, the ampicillin-resistance gene and the Charon 4Aλ bacteriophage right arm. The λCF8 construct was then digested with XhoI and the resulting 27.1 kb fragment was ligated to the 0.4 kb XhoI/EcoRI fragment of pJBP86 [described below], containing the SV40 polyA signal, and the EcoRI-digested Charon 4Aλ right arm. The resulting vector λCF-7 contains the Charon 4Aλ left arm, the CFTR-encoding DNA linked to the CMV promoter and SV40 polyA signal, the ura3 gene, the puromycin-resistance gene linked to the SV40 promoter and polyA signal and the Charon 4Aλ right arm. The λ DNA fragments provide sequences homologous to nucleotides present in the exemplary artificial chromosomes.

[0645] The vector is then introduced into cells containing the artificial chromosomes exemplified herein. Accordingly, when the linear λCF-7 vector is introduced into megachromosome-carrying fusion cell lines, such as described herein, it will be specifically integrated into the megachromosome through recombination between the homologous bacteriophage λ sequences of the vector and the artificial chromosome.

[0646] 2. Construction of Vector λCF-7-DTA

[0647] Vector λCF-7-DTA contains all the elements contained in λCF-7, but additionally contains a lethal selection marker, the diphtheria toxin-A (DT-A) gene, as well as the ampicillin-resistance gene and an origin of replication. This vector was constructed in a series of steps as follows.

[0648] a. Construction of pJBP86

[0649] Plasmid pJBP86 was used in the construction of λCF-7, above. A 1.5-kb SalI fragment of pCEPUR containing the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal was ligated to HindIII-digested pJB8 [see, e.g., Ish-Horowitz et al. (1981) Nucleic Acids Res. 9:2989-2998; available from ATCC as Accession No. 37074; commercially available from Amersham, Arlington Heights, Ill.]. Prior to ligation the SalI ends of the 1.5 kb fragment of pCEPUR and the HindIII-linearized pJB8 ends were treated with Klenow polymerase. The resulting vector pJBP86 contains the puromycin-resistance gene linked to the SV40 promoter and polyA signal, the 1.8 kb COS region of Charon 4Aλ, the ColE1 origin of replication and the ampicillin-resistance gene.

[0650] b. Construction of pMEP-DTA

[0651] A 1.1-kb XhoI/SalI fragment of pMC1-DT-A [see, e.g., Maxwell et al. (1986) Cancer Res. 46:4660-4666] containing the diphtheria toxin-A gene was ligated to XhoI-digested pMEP4 [Invitrogen, San Diego, Calif.] to generate pMEP-DTA. To produce pMC1-DT-A, the coding region of the DT-A gene was isolated as an 800 bp PstI/HindIII fragment from p2249-1 and inserted into pMC1neopolyA [pMC1 available from Stratagene] in place of the neo gene and under the control of the TK promoter. The resulting construct pMC1-DT-A was digested with HindIII, the ends were filled by Klenow and SalI linkers were ligated to produce a 1061 bp TK-DTA gene cassette with an XhoI end [5′] and a SalI end containing the 270 bp TK promoter and the ~790 bp DT-A fragment. This fragment was ligated into XhoI-digested pMEP4.

[0652] Plasmid pMEP-DTA thus contains the DT-A gene linked to the TK promoter and SV40 polyadenylation signal, the ColE1 origin of replication and the ampicillin-resistance gene.

[0653] c. Construction of pJB83-DTA9

[0654] Plasmid pJB8 was digested with HindIII and ClaI and ligated with an oligonucleotide [see SEQ ID NOs. 7 and 8 for the sense and antisense strands of the oligonucleotide, respectively] to generate pJB83. The oligonucleotide that was ligated to ClaI/HindIII-digested pJB8 contained the recognition sites of SwaI, PacI and SrfI restriction endonucleases. These sites will permit ready linearization of the λCF-7-DTA construct.

[0655] Next, a 1.4-kb XhoI/SalI fragment of pMEP-DTA, containing the DT-A gene, was ligated to SalI-digested pJB83 to generate pJB83-DTA9.

[0656] d. Construction of λCF-7-DTA

[0657] The 12-bp overhangs of λCF-7 were removed by Mung bean nuclease and subsequent T4 polymerase treatments. The resulting 41.1-kb linear λCF-7 vector was then ligated to pJB83-DTA9 which had been digested with ClaI and treated with T4 polymerase. The resulting vector, λCF-7-DTA, contains all the elements of λCF-7 as well as the DT-A gene linked to the TK promoter and the SV40 polyadenylation signal, the 1.8 kb Charon 4Aλ COS region, the ampicillin-resistance gene [from pJB83-DTA9] and the ColE1 origin of replication [from pJB83-DTA9].

[0658] D. Targeting vectors using luciferase markers: Plasmid pMCT-RUC

[0659] Plasmid pMCT-RUC [14 kbp] was constructed for site-specific targeting of the Renilla luciferase [see, e.g., U.S. Pat. Nos. 5,292,658 and 5,418,155 for a description of DNA encoding Renilla luciferase, and plasmid pTZrLuc-1, which can provide the starting material for construction of such vectors] gene to a mammalian artificial chromosome. The relevant features of this plasmid are the Renilla luciferase gene under transcriptional control of the human cytomegalovirus immediate-early gene enhancer/promoter, and the hygromycin-resistance gene, a positive selectable marker, under the transcriptional control of the thymidine kinase promoter. In particular, this plasmid contains plasmid pAG60 [see, e.g., U.S. Pat. Nos. 5,118,620, 5,021,344, 5,063,162 and 4,946,952; see, also Colbert-Garapin et al. (1981) J. Mol. Biol. 150:1-14], which includes DNA (i.e., the neomycin-resistance gene) homologous to the minichromosome, as well as the Renilla and hygromycin-resistance genes, the HSV-tk gene under control of the tk promoter as a negative selectable marker for homologous recombination, and a unique HpaI site for linearizing the plasmid.

[0660] This construct was introduced, via calcium phosphate transfection, into EC3/7C5 cells [see, Lorenz et al. (1996) J. Biolum. Chemilum. 11:31-37]. The EC3/7C5 cells were maintained as a monolayer [see, Gluzman (1981) Cell 23:175-183]. Cells at 50% confluency in 100 mm Petri dishes were used for calcium phosphate transfection [see, Harper et al. (1981) Chromosoma 83:431-439] using 10 μg of linearized pMCT-RUC per plate. Colonies originating from single transfected cells were isolated and maintained in F-12 medium containing hygromycin (300 μg/mL) and 10% fetal bovine serum. Cells were grown in 100 mm Petri dishes prior to the Renilla luciferase assay.

[0661] The Renilla luciferase assay was performed [see, e.g., Matthews et al. (1977) Biochemistry 16:85-91]. Hygromycin-resistant cell lines obtained after transfection of EC3/7C5 cells with linearized plasmid pMCT-RUC [“B” cell lines] were grown to 100% confluency for measurements of light emission in vivo and in vitro. Light emission was measured in vivo after about 30 generations as follows: growth medium was removed and replaced by 1 mL RPMI 1640 containing coelenterazine [1 mmol/L final concentration]. Light emission from cells was then visualized by placing the Petri dishes in a low light video image analyzer [Hamamatsu Argus-100]. An image was formed after 5 min. of photon accumulation using 100% sensitivity of the photon counting tube. For measuring light emission in vitro, cells were trypsinized and harvested from one Petri dish, pelleted, resuspended in 1 mL assay buffer [0.5 mol/L NaCl, 1 mmol/L EDTA, 0.1 mol/L potassium phosphate, pH 7.4] and sonicated on ice for 10 s. Lysates were then assayed in a Turner TD-20e luminometer for 10 s after rapid injection of 0.5 mL of 1 mmol/L coelenterazine, and the average value of light emission was recorded as LU [1 LU=1.6×10⁶ hν/s for this instrument].
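
By way of illustration only, the instrument calibration quoted above (1 LU = 1.6×10⁶ hν/s) may be applied to a luminometer reading as in the following minimal Python sketch. The function name and the example reading are assumptions introduced here for illustration and are not values from the experiments described herein.

```python
# Calibration stated in the text for this luminometer: 1 LU = 1.6e6 photons/s.
PHOTONS_PER_SECOND_PER_LU = 1.6e6


def lu_to_photons_per_second(light_units: float) -> float:
    """Convert a luminometer reading in LU to photons per second."""
    if light_units < 0:
        raise ValueError("light_units must be non-negative")
    return light_units * PHOTONS_PER_SECOND_PER_LU


if __name__ == "__main__":
    # Hypothetical reading, for illustration only (not a measured value).
    hypothetical_reading_lu = 250.0
    print(lu_to_photons_per_second(hypothetical_reading_lu))
```

The conversion is linear, so averaged LU values can be converted after averaging without changing the result.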

[0662] Independent cell lines of EC3/7C5 cells transfected with linearized plasmid pMCT-RUC showed different levels of Renilla luciferase activity. Similar differences in light emission were observed when measurements were performed on lysates of the same cell lines. This variation in light emission was probably due to a position effect resulting from the random integration of plasmid pMCT-RUC into the mouse genome, since enrichment for site targeting of the luciferase gene was not performed in this experiment.

[0663] To obtain transfectant populations enriched in cells in which the luciferase gene had integrated into the minichromosome, transfected cells were grown in the presence of ganciclovir. This negative selection medium selects against cells in which the added pMCT-RUC plasmid integrated into the host EC3/7C5 genome. This selection thereby enriches the surviving transfectant population with cells containing pMCT-RUC in the minichromosome. The cells surviving this selection were evaluated in luciferase assays which revealed a more uniform level of luciferase expression. Additionally, the results of in situ hybridization assays indicated that the Renilla luciferase gene was contained in the minichromosome in these cells, which further indicates successful targeting of pMCT-RUC into the minichromosome.

[0664] Plasmid pNEM-1, a variant of pMCT-RUC which also contains λ DNA to provide an extended region of homology to the minichromosome [see, other targeting vectors, below], was also used to transfect EC3/7C5 cells. Site-directed targeting of the Renilla luciferase gene and the hygromycin-resistance gene in pNEM-1 to the minichromosome in the recipient EC3/7C5 cells was achieved. This was verified by DNA amplification analysis and by in situ hybridization. Additionally, luciferase gene expression was confirmed in luciferase assays of the transfectants.

[0665] E. Protein secretion targeting vectors

[0666] Isolation of heterologous proteins produced intracellularly in mammalian cell expression systems requires cell disruption under potentially harsh conditions and purification of the recombinant protein from cellular contaminants. The process of protein isolation may be greatly facilitated by secretion of the recombinantly produced protein into the extracellular medium where there are fewer contaminants to remove during purification. Therefore, secretion targeting vectors have been constructed for use with the mammalian artificial chromosome system.

[0667] A useful model vector for demonstrating production and secretion of heterologous protein in mammalian cells contains DNA encoding a readily detectable reporter protein fused to an efficient secretion signal that directs transport of the protein to the cell membrane and secretion of the protein from the cell. Vectors pLNCX-ILRUC and pLNCX-ILRUCλ, described below, are examples of such vectors. These vectors contain DNA encoding an interleukin-2 (IL-2) signal peptide-Renilla reniformis luciferase fusion protein. The IL-2 signal peptide [encoded by the sequence set forth in SEQ ID No. 9] directs secretion of the luciferase protein, to which it is linked, from mammalian cells. Upon secretion from the host mammalian cell, the IL-2 signal peptide is cleaved from the fusion protein to deliver mature, active, luciferase protein to the extracellular medium. Successful production and secretion of this heterologous protein can be readily detected by performing luciferase assays which measure the light emitted upon exposure of the medium to the bioluminescent luciferin substrate of the luciferase enzyme. Thus, this feature will be useful when artificial chromosomes are used for gene therapy. The presence of a functional artificial chromosome carrying an IL-Ruc fusion with the accompanying therapeutic genes will be readily monitored. Body fluids or tissues can be sampled and tested for luciferase expression by adding luciferin and appropriate cofactors and observing the bioluminescence.

[0668] 1. Construction of Protein Secretion Vector pLNCX-ILRUC

[0669] Vector pLNCX-ILRUC contains a human IL-2 signal peptide-R. reniformis luciferase fusion gene linked to the human cytomegalovirus (CMV) immediate early promoter for constitutive expression of the gene in mammalian cells. The construct was prepared as follows.

[0670] a. Preparation of the IL-2 signal sequence-encoding DNA

[0671] A 69-bp DNA fragment containing DNA encoding the human IL-2 signal peptide was obtained through nucleic acid amplification, using appropriate primers for IL-2, of an HEK 293 cell line [see, e.g., U.S. Pat. No. 4,518,584 for an IL-2 encoding DNA; see, also SEQ ID No. 9; the IL-2 gene and corresponding amino acid sequence are also provided in the Genbank Sequence Database as accession nos. K02056 and J00264]. The signal peptide includes the first 20 amino acids shown in the translations provided in both of these Genbank entries and in SEQ ID NO. 9. The corresponding nucleotide sequence encoding the first 20 amino acids is also provided in these entries [see, e.g., nucleotides 293-352 of accession no. K02056 and nucleotides 478-537 of accession no. J00264], as well as in SEQ ID NO. 9. The amplification primers included an EcoRI site [GAATTC] for subcloning of the DNA fragment after ligation into pGEMT [Promega]. The forward primer is set forth in SEQ ID No. 11 and the sequence of the reverse primer is set forth in SEQ ID No. 12.

[SEQ ID No. 11] TTTGAATTCATGTACAGGATGCAACTCCTG forward
[SEQ ID No. 12] TTTGAATTCAGTAGGTGCACTGTTTGTGAC reverse
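
By way of illustration only, the two primer sequences given above can be checked for the EcoRI site and for the codons they anneal to with the following minimal Python sketch. The helper names (`annealing_region`, `reverse_complement`, `translate`) are assumptions introduced here; the standard genetic code is used, and the portion of each primer 3′ of the EcoRI site is assumed to be the gene-specific annealing region.

```python
from itertools import product

ECORI_SITE = "GAATTC"
FORWARD_PRIMER = "TTTGAATTCATGTACAGGATGCAACTCCTG"  # SEQ ID No. 11
REVERSE_PRIMER = "TTTGAATTCAGTAGGTGCACTGTTTGTGAC"  # SEQ ID No. 12

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def annealing_region(primer: str) -> str:
    """Return the part of the primer that lies 3' of the EcoRI site."""
    site_start = primer.index(ECORI_SITE)  # raises ValueError if the site is absent
    return primer[site_start + len(ECORI_SITE):]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of seq in frame 1; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate(annealing_region(FORWARD_PRIMER)))
    print(translate(reverse_complement(annealing_region(REVERSE_PRIMER))))
```

In this sketch the forward primer is read in the sense orientation starting at its ATG, whereas the reverse primer is reverse-complemented first so that it is read on the coding strand.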

[0672] b. Preparation of the R. reniformis luciferase-encoding DNA

[0673] The initial source of the R. reniformis luciferase gene was plasmid pLXSN-RUC. Vector pLXSN [see, e.g., U.S. Pat. Nos. 5,324,655, 5,470,730, 5,468,634, 5,358,866 and Miller et al. (1989) Biotechniques 7:980] is a retroviral vector capable of expressing heterologous DNA under the transcriptional control of the retroviral LTR; it also contains the neomycin-resistance gene operatively linked for expression to the SV40 early region promoter. The R. reniformis luciferase gene was obtained from plasmid pTZrLuc-1 [see, e.g., U.S. Pat. No. 5,292,658; see also the Genbank Sequence Database accession no. M63501; and see also Lorenz et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:4438-4442] and is shown as SEQ ID NO. 10. The 0.97 kb EcoRI/SmaI fragment of pTZrLuc-1 contains the coding region of the Renilla luciferase-encoding DNA. Vector pLXSN was digested and ligated with the luciferase gene to produce pLXSN-RUC, which contains the luciferase gene operably linked to the viral LTR and located upstream of the SV40 promoter, which directs expression of the neomycin-resistance gene.

[0674] c. Fusion of DNA encoding the IL-2 Signal Peptide and the R. reniformis Luciferase Gene to Yield pLXSN-ILRUC

[0675] The pGEMT vector containing the IL-2 signal peptide-encoding DNA described in 1.a. above was digested with EcoRI, and the resulting fragment encoding the signal peptide was ligated to EcoRI-digested pLXSN-RUC. The resulting plasmid, called pLXSN-ILRUC, contains the IL-2 signal peptide-encoding DNA located immediately upstream of the R. reniformis gene in pLXSN-RUC. Plasmid pLXSN-ILRUC was then used as a template for nucleic acid amplification of the fusion gene in order to add a SmaI site at the 3′ end of the fusion gene. The amplification product was subcloned into linearized [EcoRI/SmaI-digested] pGEMT [Promega] to generate ILRUC-pGEMT.

[0676] d. Introduction of the Fusion Gene into a Vector Containing Control Elements for Expression in Mammalian Cells

[0677] Plasmid ILRUC-pGEMT was digested with KspI and SmaI to release a fragment containing the IL-2 signal peptide-luciferase fusion gene which was ligated to HpaI-digested pLNCX. Vector pLNCX [see, e.g., U.S. Pat. Nos. 5,324,655 and 5,457,182; see, also Miller and Rosman (1989) Biotechniques 7:980-990] is a retroviral vector for expressing heterologous DNA under the control of the CMV promoter; it also contains the neomycin-resistance gene under the transcriptional control of a viral promoter. The vector resulting from the ligation reaction was designated pLNCX-ILRUC. Vector pLNCX-ILRUC contains the IL-2 signal peptide-luciferase fusion gene located immediately downstream of the CMV promoter and upstream of the viral 3′ LTR and polyadenylation signal in pLNCX. This arrangement provides for expression of the fusion gene under the control of the CMV promoter. Placement of the heterologous protein-encoding DNA [i.e., the luciferase gene] in operative linkage with the IL-2 signal peptide-encoding DNA provides for expression of the fusion in mammalian cells transfected with the vector such that the heterologous protein is secreted from the host cell into the extracellular medium.

[0678] 2. Construction of Protein Secretion Targeting Vector pLNCX-ILRUCλ

[0679] Vector pLNCX-ILRUC may be modified so that it can be used to introduce the IL-2 signal peptide-luciferase fusion gene into a mammalian artificial chromosome in a host cell. To facilitate specific incorporation of the pLNCX-ILRUC expression vector into a mammalian artificial chromosome, nucleic acid sequences that are homologous to nucleotides present in the artificial chromosome are added to the vector to permit site-directed recombination.

[0680] Exemplary artificial chromosomes described herein contain λ phage DNA. Therefore, protein secretion targeting vector pLNCX-ILRUCλ was prepared by addition of λ phage DNA [from Charon 4A arms] to the secretion vector pLNCX-ILRUC.

[0681] 3. Expression and Secretion of R. reniformis Luciferase from Mammalian Cells

[0682] a. Expression of R. reniformis Luciferase Using pLNCX-ILRUC

[0683] Mammalian cells [LMTK⁻ from the ATCC] were transiently transfected with vector pLNCX-ILRUC [~10 μg] by electroporation [BIORAD, performed according to the manufacturer's instructions]. Stable transfectants produced by growth in G418 for neo selection have also been prepared.

[0684] Transfectants were grown and then analyzed for expression of luciferase. To determine whether active luciferase was secreted from the transfected cells, culture media were assayed for luciferase by addition of coelenterazine [see, e.g., Matthews et al. (1977) Biochemistry 16:85-91].

[0685] The results of these assays establish that vector pLNCX-ILRUC is capable of providing constitutive expression of heterologous DNA in mammalian host cells. Furthermore, the results demonstrate that the human IL-2 signal peptide is capable of directing secretion of proteins fused to the C-terminus of the peptide. Additionally, these data demonstrate that the R. reniformis luciferase protein is a highly effective reporter molecule, which is stable in a mammalian cell environment, and forms the basis of a sensitive, facile assay for gene expression.

[0686] b. Renilla reniformis luciferase appears to be secreted from LMTK⁻ cells.

[0687] (i) Renilla luciferase assay of cell pellets

[0688] The following cells were tested:

[0689] cells with no vector: LMTK⁻ cells without vector as a negative control;

[0690] cells transfected with pLNCX only;

[0691] cells transfected with RUC-pLNCX [Renilla luciferase gene in pLNCX vector];

[0692] cells transfected with pLNCX-ILRUC [vector containing the IL-2 leader sequence + Renilla luciferase fusion gene in pLNCX vector].

[0693] Forty-eight hours after electroporation, the cells and culture medium were collected. The cell pellet from 4 plates of cells was resuspended in 1 ml assay buffer and was lysed by sonication. Two hundred μl of the resuspended cell pellet was used for each assay for luciferase activity [see, e.g., Matthews et al. (1977) Biochemistry 16:85-91]. The assay was repeated three times and the average bioluminescence measurement was obtained.

[0694] The results showed that there was relatively low background bioluminescence in the cells transformed with pLNCX or the negative control cells; there was a low level observed in the cell pellet from cells containing the vector with the IL-2 leader sequence-luciferase gene fusion and more than 5000 RLU in the sample from cells containing RUC-pLNCX.

[0695] (ii) Renilla luciferase assay of cell medium

[0696] Forty milliliters of medium from 4 plates of cells were harvested and spun down. Two hundred microliters of medium was used for each luciferase activity assay. The assay was repeated several times and the average bioluminescence measurement was obtained. These results showed that a relatively high level of bioluminescence was detected in the cell medium from cells transformed with pLNCX-ILRUC; about 10-fold lower levels [slightly above the background levels in medium from cells with no vector or transfected with pLNCX only] were detected in the medium from cells transfected with RUC-pLNCX.

[0697] (iii) Conclusions

[0698] The results of these experiments demonstrated that Renilla luciferase appears to be secreted from LMTK⁻ cells under the direction of the IL-2 signal peptide. The medium from cells transfected with Renilla luciferase-encoding DNA linked to the DNA encoding the IL-2 secretion signal had substantially higher levels of Renilla luciferase activity than controls or cells containing luciferase-encoding DNA without the signal peptide-encoding DNA. Also, the differences between the controls and cells containing luciferase-encoding DNA demonstrate that the luciferase activity is specifically from luciferase, not from a non-specific reaction. In addition, the results from the medium of RUC-pLNCX transfected cells, which are similar to background, show that the luciferase activity in the medium does not come from cell lysis, but from secreted luciferase.

[0699] c. Expression of R. reniformis Luciferase Using pLNCX-ILRUCλ

[0700] To express the IL-2 signal peptide-R. reniformis fusion gene from a mammalian artificial chromosome, vector pLNCX-ILRUCλ is targeted for site-specific integration into a mammalian artificial chromosome through homologous recombination of the λ DNA sequences contained in the chromosome and the vector. This is accomplished by introduction of pLNCX-ILRUCλ into either a fusion cell line harboring mammalian artificial chromosomes or mammalian host cells that contain mammalian artificial chromosomes. If the vector is introduced into a fusion cell line harboring the artificial chromosomes, for example through microinjection of the vector or transfection of the fusion cell line with the vector, the cells are then grown under selective conditions. The artificial chromosomes, which have incorporated vector pLNCX-ILRUCλ, are isolated from the surviving cells, using purification procedures as described above, and then injected into the mammalian host cells.

[0701] Alternatively, the mammalian host cells may first be injected with mammalian artificial chromosomes which have been isolated from a fusion cell line. The host cells are then transfected with vector pLNCX-ILRUCλ and grown.

[0702] The recombinant host cells are then assayed for luciferase expression as described above.

[0703] F. Other targeting vectors

[0704] These vectors, which are based on vector pMCT-RUC, rely on positive and negative selection to ensure insertion and selection for the double recombinants. A single crossover results in incorporation of the DT-A gene, which kills the cell; double crossover recombinations delete the DT-A gene.

1. Plasmid pNEM-1 contains:
DT-A: Diphtheria toxin gene (negative selectable marker)
Hyg: Hygromycin gene (positive selectable marker)
ruc: Renilla luciferase gene (non-selectable marker)
1: LTR-MMTV promoter
2: TK promoter
3: CMV promoter
MMR: Homology region (plasmid pAG60)

2. Plasmids pNEM-2 and -3 are similar to pNEM-1 except for different negative selectable markers:
pNEM-1: diphtheria toxin gene as “-” selectable marker
pNEM-2: hygromycin antisense gene as “-” selectable marker
pNEM-3: thymidine kinase HSV-1 gene as “-” selectable marker

3. Plasmid - λ DNA based homology:
pNEMλ-1: base vector
pNEMλ-2: base vector containing p5 = gene
1: LTR MMTV promoter
2: SV40 promoter
3: CMV promoter
4: MTIIA promoter (metallothionein gene promoter)
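
By way of illustration only, the positive/negative selection logic relied on by these vectors can be sketched in Python as follows. The class and function names are assumptions introduced here, and the model is deliberately simplified: a cell is taken to die whenever the DT-A gene is retained, and to require the Hyg gene to survive in hygromycin-containing medium.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationOutcome:
    """Markers retained by a cell after a given integration event (simplified)."""
    name: str
    has_hyg: bool  # positive selectable marker retained
    has_dta: bool  # negative selectable marker (DT-A) retained


def survives(outcome: IntegrationOutcome, hygromycin_present: bool = True) -> bool:
    """Return True if the cell survives under the simplified selection model."""
    if outcome.has_dta:
        return False  # DT-A retained: the toxin kills the cell
    if hygromycin_present and not outcome.has_hyg:
        return False  # no Hyg gene: the cell is killed by hygromycin
    return True


# Outcomes named in the text, plus an untransfected cell for comparison.
OUTCOMES = [
    IntegrationOutcome("no integration", has_hyg=False, has_dta=False),
    IntegrationOutcome("single crossover", has_hyg=True, has_dta=True),
    IntegrationOutcome("double crossover", has_hyg=True, has_dta=False),
]

if __name__ == "__main__":
    for outcome in OUTCOMES:
        print(outcome.name, "survives" if survives(outcome) else "is eliminated")
```

Under these assumptions only the double recombinants remain after selection, which is the enrichment the vectors are designed to provide.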

EXAMPLE 13

[0705] Microinjection of Mammalian Cells with Plasmid DNA

[0706] These procedures will be used to microinject MACs into eukaryotic cells, including mammalian and insect cells.

[0707] The microinjection technique is based on the use of small glass capillaries as a delivery system into cells and has been used for introduction of DNA fragments into nuclei [see, e.g., Chalfie et al. (1994) Science 263:802-804]. It allows the transfer of almost any type of molecule, e.g., hormones, proteins, DNA and RNA, into either the cytoplasm or nuclei of recipient cells. This technique has no cell type restriction and is more efficient than other methods, including Ca²⁺-mediated gene transfer and liposome-mediated gene transfer. About 20-30% of the injected cells become successfully transformed.

[0708] Microinjection is performed under a phase-contrast microscope. A glass microcapillary, prefilled with the DNA sample, is directed into a cell to be injected with the aid of a micromanipulator. An appropriate sample volume (1-10 pl) is transferred into the cell by gentle air pressure exerted by a transjector connected to the capillary. Recipient cells are grown on glass slides imprinted with numbered squares for convenient localization of the injected cells.

[0709] a. Materials and equipment

[0710] Nunclon tissue culture dishes 35×10 mm, mouse cell line EC3/7C5, plasmid DNA pCH110 [Pharmacia], purified Green Fluorescent Protein (GFP) [GFPs from Aequorea and Renilla have been purified and also DNA encoding GFPs has been cloned; see, e.g., Prasher et al. (1992) Gene 111:229-233; International PCT Application No. WO 95/07463, which is based on U.S. application Ser. No. 08/119,678 and U.S. application Ser. No. 08/192,274], ZEISS Axiovert 100 microscope, Eppendorf transjector 5246, Eppendorf micromanipulator 5171, Eppendorf Cellocate coverslips, Eppendorf microloaders, Eppendorf femtotips and other standard equipment.

[0711] b. Protocol for injecting

[0712] (1) Fibroblast cells are grown in 35 mm tissue culture dishes(37° C., 5% C0₂) until the cell density reaches 80% confluency. Thedishes are removed from the incubator and medium is added to about a 5mm depth.

[0713] (2) The dish is placed onto the dish holder and the cellsobserved with 10× objective; the focus is desirably above the cellsurface.

[0714] (3) Plasmid or chromosomal DNA solution [1 ng/μl] and GFP proteinsolution are further purified by centrifuging the DNA sample at a forcesufficient to remove any particular debris [typically about 10,000 rpmfor 10 minutes in a microcentrifuge].

[0715] (4) Two 2 μl of the DNA solution (1 ng/μl) is loaded into amicrocapillary with an Eppendorf microloader. During loading, the loaderis inserted to the tip end of the microcapillary. GFP (1 mg/ml) isloaded with the same procedure.

[0716] (5) The protecting sheath is removed from the microcapillary and the microcapillary is fixed onto the capillary holder connected with the micromanipulator.

[0717] (6) The capillary tip is lowered to the surface of the medium and is focused on the cells gradually until the tip of the capillary reaches the surface of a cell. The capillary is lowered further so that it is inserted into the cell. Various parameters, such as the level of the capillary, the time and pressure, are determined for the particular equipment. For example, using the fibroblast cell line C5 and the above-noted equipment, the best conditions are: injection time 0.4 second, pressure 80 psi. DNA can then be automatically injected into the nuclei of the cells.

[0718] (7) After injection, the cells are returned to the incubator and incubated for about 18-24 hours.

[0719] (8) After incubation the number of transformants can be determined by a suitable method, which depends upon the selection marker. For example, if green fluorescent protein is used, the assay can be performed using a UV light source and fluorescent filter set at 0-24 hours after injection. If β-gal-containing DNA, such as DNA derived from pCH110, has been injected, then the transformants can be assayed for β-gal.
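The quantities given in the protocol above (a 1 ng/μl DNA solution and an injected volume of 1-10 pl) imply a very small DNA dose per cell. The following is a minimal sketch of that arithmetic, not part of the described method. The plasmid length of 7,000 bp and the average mass of 650 g/mol per base pair are illustrative assumptions; the function names are hypothetical.

```python
AVOGADRO = 6.022e23           # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # assumed average molar mass of one base pair


def injected_mass_fg(volume_pl, conc_ng_per_ul):
    """Mass of DNA delivered, in femtograms.

    1 ng/ul is numerically equal to 1 fg/pl, so the product of the
    two inputs is already expressed in femtograms.
    """
    return volume_pl * conc_ng_per_ul


def copies_per_injection(volume_pl, conc_ng_per_ul, plasmid_bp):
    """Approximate number of plasmid molecules delivered per injection."""
    mass_g = injected_mass_fg(volume_pl, conc_ng_per_ul) * 1e-15
    moles = mass_g / (plasmid_bp * BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


# Illustrative call: 1 pl of a 1 ng/ul solution, hypothetical 7,000 bp plasmid.
example_copies = copies_per_injection(1.0, 1.0, 7000)
print(f"about {example_copies:.0f} copies per pl")
```

Under these assumptions the dose scales linearly with injected volume, so the 1-10 pl range in the text corresponds to a tenfold range in delivered copies.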

[0720] (c) Detection of β-galactosidase in cells injected with plasmid DNA

[0721] The medium is removed from the culture plate and the cells are fixed by addition of 5 ml of Fixation Solution I: (1% glutaraldehyde; 0.1 M sodium phosphate buffer, pH 7.0; 1 mM MgCl₂), and incubated for 15 minutes at 37° C. Fixation Solution I is replaced with 5 ml of X-gal Solution II: [0.2% X-gal, 10 mM sodium phosphate buffer (pH 7.0), 150 mM NaCl, 1 mM MgCl₂, 3.3 mM K₄Fe(CN)₆H₂O, 3.3 mM K₃Fe(CN)₆], and the plates are incubated for 30-60 minutes at 37° C. The X-gal solution is removed and 2 ml of 70% glycerol is added to each dish. Blue stained cells are identified under a light microscope.
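The concentrations listed for X-gal Solution II can be converted into weighed amounts for the 5 ml used per dish. The following is a rough sketch of that conversion, assuming that the X-gal percentage is weight per volume and using the standard molar mass of NaCl (58.44 g/mol); the function names are hypothetical and the other salts would need their own molar masses supplied.

```python
def percent_wv_to_mg(percent, volume_ml):
    """Milligrams of solute for a % (w/v) solution (grams per 100 ml)."""
    return percent / 100.0 * volume_ml * 1000.0


def millimolar_to_mg(conc_mm, volume_ml, molar_mass_g_per_mol):
    """Milligrams of solute for a given millimolar concentration.

    mmol/L * L * g/mol gives milligrams directly.
    """
    return conc_mm * (volume_ml / 1000.0) * molar_mass_g_per_mol


VOLUME_ML = 5.0  # volume of X-gal Solution II added per dish in the text

xgal_mg = percent_wv_to_mg(0.2, VOLUME_ML)          # 0.2% X-gal, assumed w/v
nacl_mg = millimolar_to_mg(150.0, VOLUME_ML, 58.44)  # 150 mM NaCl

print(f"X-gal: {xgal_mg:.1f} mg, NaCl: {nacl_mg:.2f} mg per {VOLUME_ML} ml")
```

In practice such solutions are normally prepared from concentrated stocks at larger volumes; the sketch only makes the per-dish quantities explicit.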

[0722] This method will be used to introduce a MAC, particularly the MAC with the anti-HIV megachromosome, to produce a mouse model for anti-HIV activity.

EXAMPLE 14

[0723] Transgenic (Non-Human) Animals

[0724] Transgenic (non-human) animals can be generated that express heterologous genes which confer desired traits, e.g., disease resistance, in the animals. A transgenic mouse is prepared to serve as a model of a disease-resistant animal. Genes that encode vaccines or that encode therapeutic molecules can be introduced into embryos or ES cells to produce animals that express the gene product and thereby are resistant to or less susceptible to a particular disorder.

[0725] The mammalian artificial megachromosome and others of the artificial chromosomes, particularly the SATACs, can be used to generate transgenic (non-human) animals, including mammals and birds, that stably express genes conferring desired traits, such as genes conferring resistance to pathogenic viruses. The artificial chromosomes can also be used to produce transgenic (non-human) animals, such as pigs, that can produce immunologically humanized organs for xenotransplantation.

[0726] For example, transgenic mice containing a transgene encoding an anti-HIV ribozyme provide a useful model for the development of stable transgenic (non-human) animals using these methods. The artificial chromosomes can be used to produce transgenic (non-human) animals, particularly cows, goats, mice, oxen, camels, pigs and sheep, that produce the proteins of interest in their milk; and to produce transgenic chickens and other egg-producing fowl that produce therapeutic proteins or other proteins of interest in their eggs. For example, use of mammary gland-specific promoters for expression of heterologous DNA in milk is known [see, e.g., U.S. Pat. No. 4,873,316]. In particular, a milk-specific promoter or a promoter, preferably linked to a milk-specific signal peptide, specifically activated in mammary tissue is operatively linked to the DNA of interest, thereby providing expression of that DNA sequence in milk.

[0727] 1. Development of Control Transgenic Mice Expressing Anti-HIV Ribozyme

[0728] Control transgenic mice are generated in order to compare stability and amounts of transgene expression in mice developed using transgene DNA carried on a vector (control mice) with expression in mice developed using transgenes carried in an artificial megachromosome.

[0729] a. Development of Control Transgenic Mice Expressing β-galactosidase

[0730] One set of control transgenic mice was generated by microinjection of mouse embryos with the β-galactosidase gene alone. The microinjection procedure used to introduce the plasmid DNA into the mouse embryos is as described in Example 13, but modified for use with embryos [see, e.g., Hogan et al. (1994) Manipulating the Mouse Embryo, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., see, especially pages 255-264 and Appendix 3]. Fertilized mouse embryos [Strain CB6 obtained from Charles River Co.] were injected with 1 ng of plasmid pCH110 (Pharmacia) which had been linearized by digestion with BamHI. This plasmid contains the β-galactosidase gene linked to the SV40 late promoter. The β-galactosidase gene product provides a readily detectable marker for successful transgene expression. Furthermore, these control mice provide confirmation of the microinjection procedure used to introduce the plasmid into the embryos. Additionally, because the megachromosome that is transferred to the mouse embryos in the model system (see below) also contains the β-galactosidase gene, the control transgenic mice that have been generated by injection of pCH110 into embryos serve as an analogous system for comparison of heterologous gene expression from a plasmid versus from a gene carried on an artificial chromosome.

[0731] After injection, the embryos are cultured in modified HTF medium under 5% CO₂ at 37° C. for one day until they divide to form two cells. The two-cell embryos are then implanted into surrogate mother female mice [for procedures see, Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 127 et seq.].

[0732] b. Development of Control Transgenic Mice Expressing Anti-HIV Ribozyme

[0733] One set of anti-HIV ribozyme gene-containing control transgenic mice was generated by microinjection of mouse embryos with plasmid pCEPUR-132 which contains three different genes: (1) DNA encoding an anti-HIV ribozyme, (2) the puromycin-resistance gene and (3) the hygromycin-resistance gene. Plasmid pCEPUR-132 was constructed by ligating portions of plasmid pCEP-132 containing the anti-HIV ribozyme gene (referred to as ribozyme D by Chang et al. [(1990) Clin. Biotech. 2:23-31]; see also U.S. Pat. No. 5,144,019 to Rossi et al., particularly FIG. 4 of the patent) and the hygromycin-resistance gene with a portion of plasmid pCEPUR containing the puromycin-resistance gene.

[0734] Plasmid pCEP-132 was constructed as follows. Vector pCEP4 (Invitrogen, San Diego, Calif.; see also Yates et al. (1985) Nature 313:812-815) was digested with XhoI which cleaves in the multiple cloning site region of the vector. This ˜10.4-kb vector contains the hygromycin-resistance gene linked to the thymidine kinase gene promoter and polyadenylation signal, as well as the ampicillin-resistance gene and ColE1 origin of replication and EBNA-1 (Epstein-Barr virus nuclear antigen) genes and OriP. The multiple cloning site is flanked by the cytomegalovirus promoter and SV40 polyadenylation signal.

[0735] XhoI-digested pCEP4 was ligated with a fragment obtained by digestion of plasmid 132 (see Example 4 for a description of this plasmid) with XhoI and SalI. This XhoI/SalI fragment contains the anti-HIV ribozyme gene linked at the 3′ end to the SV40 polyadenylation signal. The plasmid resulting from this ligation was designated pCEP-132. Thus, in effect, pCEP-132 comprises pCEP4 with the anti-HIV ribozyme gene and SV40 polyadenylation signal inserted in the multiple cloning site for CMV promoter-driven expression of the anti-HIV ribozyme gene.

[0736] To generate pCEPUR-132, pCEP-132 was ligated with a fragment of pCEPUR. pCEPUR was prepared by ligating a 7.7-kb fragment generated upon NheI/NruI digestion of pCEP4 with a 1.1-kb NheI/SnaBI fragment of pBabe [see Morgenstern and Land (1990) Nucleic Acids Res. 18:3587-3596 for a description of pBabe] that contains the puromycin-resistance gene linked at the 5′ end to the SV40 promoter. Thus, pCEPUR is made up of the ampicillin-resistance and EBNA1 genes, as well as the ColE1 and OriP elements from pCEP4 and the puromycin-resistance gene from pBabe. The puromycin-resistance gene in pCEPUR is flanked by the SV40 promoter (from pBabe) at the 5′ end and the SV40 polyadenylation signal (from pCEP4) at the 3′ end.
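The ligations described for pCEP-132 and pCEPUR rely on restriction fragments whose ends can be joined: XhoI with SalI, NheI with NheI, and the blunt cutters NruI with SnaBI. The following minimal sketch checks that compatibility from the commonly published recognition sites and cut positions of these enzymes. The table, helper names and the simple model (palindromic sites, top-strand cut index) are illustrative assumptions, not a description of how the plasmids were designed.

```python
# Recognition site and top-strand cut index (number of bases 5' of the cut).
ENZYMES = {
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "NheI": ("GCTAGC", 1),
    "BamHI": ("GGATCC", 1),
    "NruI": ("TCGCGA", 3),
    "SnaBI": ("TACGTA", 3),
}


def overhang(enzyme):
    """Return (kind, bases) for the end left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut          # cut position on the opposite strand
    if cut == mirror:
        return ("blunt", "")
    kind = "5prime" if cut < mirror else "3prime"
    return (kind, site[min(cut, mirror):max(cut, mirror)])


def compatible(enzyme_a, enzyme_b):
    """True when the two ends can be ligated (same overhang, or both blunt)."""
    return overhang(enzyme_a) == overhang(enzyme_b)


# Pairs used in the constructions described in the text.
pairs = [("XhoI", "SalI"), ("NheI", "NheI"), ("NruI", "SnaBI")]
results = {pair: compatible(*pair) for pair in pairs}

# Expected size of pCEPUR from the two fragment sizes given in the text.
pcepur_kb = round(7.7 + 1.1, 1)
print(results, pcepur_kb)
```

The same check shows that an NheI end would not join an XhoI end, which is why the paired digests in the text fix the orientation of the NheI/blunt fragment.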

[0737] Plasmid pCEPUR was digested with XhoI and SalI and the fragment containing the puromycin-resistance gene linked at the 5′ end to the SV40 promoter was ligated with XhoI-digested pCEP-132 to yield the ˜12.1-kb plasmid designated pCEPUR-132. Thus, pCEPUR-132, in effect, comprises pCEP-132 with puromycin-resistance gene and SV40 promoter inserted at the XhoI site. The main elements of pCEPUR-132 are the hygromycin-resistance gene linked to the thymidine kinase promoter and polyadenylation signal, the anti-HIV ribozyme gene linked to the CMV promoter and SV40 polyadenylation signal, and the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal. The plasmid also contains the ampicillin-resistance and EBNA1 genes and the ColE1 origin of replication and OriP.

[0738] Zygotes were prepared from (C57BL/6JxCBA/J) F1 female mice [see, e.g., Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., p. 429], which had been previously mated with a (C57BL/6JxCBA/J) F1 male. The male pronuclei of these F2 zygotes were injected [see, Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.] with pCEPUR-132 (˜3 μg/ml), which had been linearized by digestion with NruI. The injected eggs were then implanted in surrogate mother female mice for development into transgenic offspring.

[0739] These primary carrier offspring were analyzed (as described below) for the presence of the transgene in DNA isolated from tail cells. Seven carrier mice that contained transgenes in their tail cells (but that may not carry the transgene in all their cells, i.e., they may be chimeric) were allowed to mate to produce non-chimeric or germ-line heterozygotes. The heterozygotes were, in turn, crossed to generate homozygote transgenic offspring.
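The breeding scheme above (heterozygote crossed with heterozygote to obtain homozygotes) can be illustrated with a simple Punnett-square calculation. This is a sketch under the assumption that the transgene integrated at a single locus and segregates in Mendelian fashion; the allele symbols "T" (transgene) and "w" (no transgene) and the function name are hypothetical.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype fractions for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "Tw".
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Heterozygote x heterozygote, as in the final cross described in the text.
offspring = cross("Tw", "Tw")
print(offspring)
```

Under these assumptions about one quarter of the offspring of the heterozygote cross are expected to be homozygous for the transgene, so each litter still has to be genotyped to identify them.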

[0740] 2. Development of Model Transgenic Mice Using Mammalian Artificial Chromosomes

[0741] Fertilized mouse embryos are microinjected (as described above) with megachromosomes (1-10 pL containing 0-1 chromosomes/pL) isolated from fusion cell line G3D5 or H1D3 (described above). The megachromosomes are isolated as described herein. Megachromosomes isolated from either cell line carry the anti-HIV ribozyme (ribozyme D) gene as well as the hygromycin-resistance and β-galactosidase genes. The injected embryos are then developed into transgenic mice as described above.
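Because the injected suspension is dilute (0-1 chromosomes/pL in 1-10 pL), not every injection necessarily delivers a megachromosome. The following sketch estimates the expected number delivered and the chance of delivering at least one. It assumes, purely for illustration, that chromosomes are randomly distributed in the suspension so that the count per injection follows a Poisson distribution; the source does not state this, and the function names are hypothetical.

```python
import math


def expected_chromosomes(volume_pl, chromosomes_per_pl):
    """Mean number of chromosomes in the injected volume."""
    return volume_pl * chromosomes_per_pl


def prob_at_least_one(volume_pl, chromosomes_per_pl):
    """Poisson probability that an injection contains one or more chromosomes."""
    mean = expected_chromosomes(volume_pl, chromosomes_per_pl)
    return 1.0 - math.exp(-mean)


# Illustrative values inside the ranges given in the text.
for volume, density in [(1.0, 0.5), (10.0, 0.5), (10.0, 1.0)]:
    p = prob_at_least_one(volume, density)
    print(f"{volume:>4} pL at {density} per pL -> P(>=1) = {p:.3f}")
```

Under this assumption, larger injected volumes or denser suspensions raise the fraction of embryos that receive a chromosome, at the cost of more embryos receiving several.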

[0742] Alternatively, the megachromosome-containing cell line G3D5* or H1D3 is fused with mouse embryonic stem cells [see, e.g., U.S. Pat. No. 5,453,357, commercially available; see Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pages 253-289] following standard procedures [see also, e.g., Guide to Techniques in Mouse Development in Methods in Enzymology Vol. 25, Wassarman and DePamphilis, eds. (1993), pages 803-932]. (It is also possible to deliver isolated megachromosomes into embryonic stem cells using the Microcell procedure [such as that described above].) The stem cells are cultured in the presence of a fibroblast [e.g., STO fibroblasts that are resistant to hygromycin and puromycin]. Cells of the resultant fusion cell line, which contains megachromosomes carrying the transgenes [i.e., anti-HIV ribozyme, hygromycin-resistance and β-galactosidase genes], are then transplanted into mouse blastocysts, which are in turn implanted into a surrogate mother female mouse where development into a transgenic mouse will occur.

[0743] Mice generated by this method are chimeric; the transgenes will be expressed in only certain areas of the mouse, e.g., the head, and thus may not be expressed in all cells.

[0744] 3. Analysis of Transgenic Mice for Transgene Expression

[0745] Beginning when the transgenic mice, generated as described above, are three-to-four weeks old, they can be analyzed for stable expression of the transgenes that were transferred into the embryos [or fertilized eggs] from which they develop. The transgenic mice may be analyzed in several ways as follows.

[0746] a. Analysis of Cells Obtained from the Transgenic Mice

[0747] Cell samples [e.g., spleen, liver and kidney cells, lymphocytes, tail cells] are obtained from the transgenic mice. Any cells may be tested for transgene expression. If, however, the mice are chimeras generated by microinjection of fertilized eggs or by fusion of embryonic stem cells with megachromosome-containing cells, only cells from areas of the mouse that carry the transgene are expected to express the transgene. If the cells survive growth on hygromycin [or hygromycin and puromycin or neomycin, if the cells are obtained from mice generated by transfer of both antibiotic-resistance genes], this is one indication that they are stably expressing the transgenes. RNA isolated from the cells according to standard methods may also be analyzed by northern blot procedures to determine if the cells express transcripts that hybridize to nucleic acid probes based on the antibiotic-resistance genes. Additionally, cells obtained from the transgenic mice may also be analyzed for β-galactosidase expression using standard assays for this marker enzyme [for example, by direct staining of the product of a reaction involving β-galactosidase and the X-gal substrate, see, e.g., Jones (1986) EMBO J. 5:3133-3142, or by measurement of β-galactosidase activity, see, e.g., Miller (1972) in Experiments in Molecular Genetics pp. 352-355, Cold Spring Harbor Press]. Analysis of β-galactosidase expression is particularly used to evaluate transgene expression in cells obtained from control transgenic mice in which the only transgene transferred into the embryo was the β-galactosidase gene.

[0748] Stable expression of the anti-HIV ribozyme gene in cells obtained from the transgenic mice may be evaluated in several ways. First, DNA isolated from the cells according to standard procedures may be subjected to nucleic acid amplification using primers corresponding to the ribozyme gene sequence. If the gene is contained within the cells, an amplified product of pre-determined size is detected upon hybridization of the reaction mixture to a nucleic acid probe based on the ribozyme gene sequence. Furthermore, DNA isolated from the cells may be analyzed using Southern blot methods for hybridization to such a nucleic acid probe. Second, RNA isolated from the cells may be subjected to northern blot hybridization to determine if the cells express RNA that hybridizes to nucleic acid probes based on the ribozyme gene. Third, the cells may be analyzed for the presence of anti-HIV ribozyme activity as described, for example, in Chang et al. (1990) Clin. Biotech. 2:23-31. In this analysis, RNA isolated from the cells is mixed with radioactively labeled HIV gag target RNA, which can be obtained by in vitro transcription of gag gene template, under reaction conditions favorable to in vitro cleavage of the gag target, such as those described in Chang et al. (1990) Clin. Biotech. 2:23-31. After the reaction has been stopped, the mixture is analyzed by gel electrophoresis to determine if cleavage products smaller in size than the whole template are detected; presence of such cleavage fragments is indicative of the presence of stably expressed ribozyme.
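The "amplified product of pre-determined size" mentioned above follows directly from where the two primers anneal on the template. The following is a minimal in-silico sketch of that size prediction using exact string matching. The template and primer sequences are invented toy sequences, not the ribozyme gene or any primers from this document, and the function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_size(template, forward_primer, reverse_primer):
    """Length of the product bounded by both primers, or None if no product.

    The forward primer must match the top strand; the reverse primer must
    match the bottom strand downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = revcomp(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Toy example: 4 nt flank, 10 nt primer site, 20 nt spacer, 10 nt primer site.
toy_template = "TTTT" + "ACGTACGGTC" + "A" * 20 + "GGATCCTTAG" + "CCCC"
toy_forward = "ACGTACGGTC"
toy_reverse = revcomp("GGATCCTTAG")

toy_size = amplicon_size(toy_template, toy_forward, toy_reverse)
print(toy_size)
```

A band of the predicted size on the gel, confirmed by probe hybridization as the text describes, is then taken as evidence that the gene is present; the absence of either primer site yields no predicted product.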

[0749] b. Analysis of Whole Transgenic Mice

[0750] Whole transgenic mice that have been generated by transfer of the anti-HIV ribozyme gene [as well as selection and marker genes] into embryos or fertilized eggs can additionally be analyzed for transgene expression by challenging the mice with infection with HIV. It is possible for mice to be infected with HIV upon intraperitoneal injection with high-producing HIV-infected U937 cells [see, e.g., Locardi et al. (1992) J. Virol. 66:1649-1654]. Successful infection may be confirmed by analysis of DNA isolated from cells, such as peripheral blood mononuclear cells, obtained from transgenic mice that have been injected with HIV-infected human cells. The DNA of infected transgenic mouse cells will contain HIV-specific gag and env sequences, as demonstrated by, for example, nucleic acid amplification using HIV-specific primers. If the cells also stably express the anti-HIV ribozyme, then analysis of RNA extracts of the cells should reveal the smaller gag fragments arising by cleavage of the gag transcript by the ribozyme.

[0751] Additionally, the transgenic mice carrying the anti-HIV ribozyme gene can be crossed with transgenic mice expressing human CD4 (i.e., the cellular receptor for HIV) [see Gillespie et al. (1993) Mol. Cell. Biol. 13:2952-2958; Hanna et al. (1994) Mol. Cell. Biol. 14:1084-1094; and Yeung et al. (1994) J. Exp. Med. 180:1911-1920, for a description of transgenic mice expressing human CD4]. The offspring of these crossed transgenic mice expressing both the CD4 and anti-HIV ribozyme transgenes should be more resistant to infection [as a result of a reduction in the levels of active HIV in the cells] than mice expressing CD4 alone [without expressing anti-HIV ribozyme].

[0752] 4. Development of transgenic chickens using artificial chromosomes

[0753] The development of transgenic chickens has many applications in the improvement of domestic poultry, an agricultural species of commercial significance, such as the introduction of disease resistance genes and genes encoding therapeutic proteins. It appears that efforts in the area of chicken transgenesis have been hampered due to difficulty in achieving stable expression of transgenes in chicken cells using conventional methods of gene transfer via random introduction into recipient cells. Artificial chromosomes are, therefore, particularly useful in the development of transgenic chickens because they provide for stable maintenance of transgenes in host cells.

[0754] a. Preparation of artificial chromosomes for introduction of transgenes into recipient chicken cells

[0755] (i) Mammalian artificial chromosomes

[0756] Mammalian artificial chromosomes, such as the SATACs and minichromosomes described herein, can be modified to incorporate detectable reporter genes and/or transgenes of interest for use in developing transgenic chickens. Alternatively, chicken-specific artificial chromosomes can be constructed using the methods herein. In particular, chicken artificial chromosomes [CACs] can be prepared using the methods herein for preparing MACs; or, as described above, the chicken libraries can be introduced into MACs provided herein and the resulting MACs introduced into chicken cells and those that are functional in chicken cells selected.

[0757] As described in Examples 4 and 7, and elsewhere herein, artificial chromosome-containing mouse LMTK⁻-derived cell lines, or minichromosome-containing cell lines, as well as hybrids thereof, can be transfected with selected DNA to generate MACs [or CACs] that have integrated the foreign DNA for functional expression of heterologous genes contained within the DNA.

[0758] To generate MACs or CACs containing transgenes to be expressed in chicken cells, the MAC-containing cell lines may be transfected with DNA that includes λ DNA and transgenes of interest operably linked to a promoter that is capable of driving expression of genes in chicken cells. Alternatively, the minichromosomes or MACs [or CACs], produced as described above, can be isolated and introduced into cells, followed by targeted integration of selected DNA. Vectors for targeted integration are provided herein or can be constructed as described herein.

[0759] Promoters of interest include constitutive, inducible and tissue (or cell)-specific promoters known to those of skill in the art to promote expression of genes in chicken cells. For example, expression of the lacZ gene in chicken blastodermal cells and primary chicken fibroblasts has been demonstrated using a mouse heat-shock protein 68 (hsp 68) promoter [phspPTlacZpA; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312], a Zn²⁺-inducible chicken metallothionein (cMt) promoter [pCBcMtlacZ; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312], the constitutive Rous sarcoma virus and chicken β-actin promoters in tandem [pmiwZ; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312] and the constitutive cytomegalovirus (CMV) promoter. Of particular interest herein are egg-specific promoters that are derived from genes, such as ovalbumin and lysozyme, that are expressed in eggs.

[0760] The choice of promoter will depend on a variety of factors, including, for example, whether the transgene product is to be expressed throughout the transgenic chicken or restricted to certain locations, such as the egg. Cell-specific promoters functional in chickens include the steroid-responsive promoter of the egg ovalbumin protein-encoding gene [see Gaub et al. (1987) EMBO J. 6:2313-2320; Tora et al. (1988) EMBO J. 7:3771-3778; Park et al. (1995) Biochem. Mol. Biol. Int. (Australia) 36:811-816].

[0761] (ii) Chicken artificial chromosomes

[0762] Additionally, chicken artificial chromosomes may be generated using methods described herein. For example, chicken cells, such as primary chicken fibroblasts [see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312], may be transfected with DNA that encodes a selectable marker [such as a protein that confers resistance to antibiotics] and that includes DNA (such as chicken satellite DNA) that targets the introduced DNA to the pericentric region of the endogenous chicken chromosomes. Transfectants that survive growth on selection medium are then analyzed, using methods described herein, for the presence of artificial chromosomes, including minichromosomes, and particularly SATACs. An artificial chromosome-containing transfectant cell line may then be transfected with DNA encoding the transgene of interest [fused to an appropriate promoter] along with DNA that targets the foreign DNA to the chicken artificial chromosome.

[0763] b. Introduction of artificial chromosomes carrying transgenes of interest into recipient chicken cells

[0764] Cell lines containing artificial chromosomes that harbor transgene(s) of interest (i.e., donor cells) may be fused with recipient chicken cells in order to transfer the chromosomes into the recipient cells. Alternatively, the artificial chromosomes may be isolated from the donor cells, for example, using methods described herein [see, e.g., Example 10], and directly introduced into recipient cells.

[0765] Exemplary chicken recipient cell lines include, but are not limited to, stage X blastoderm cells [see, e.g., Brazolot et al. (1991) Mol. Reprod. Dev. 30:304-312; Etches et al. (1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189] and chick zygotes [see, e.g., Love et al. (1994) Biotechnology 12:60-63].

[0766] For example, microcell fusion is one method for introduction of artificial chromosomes into avian cells [see, e.g., Dieken et al. (1996) Nature Genet. 12:174-182 for methods of fusing microcells with DT40 chicken pre-B cells]. In this method, microcells are prepared [for example, using procedures described in Example 1.A.5] from the artificial chromosome-containing cell lines and fused with chicken recipient cells.

[0767] Isolated artificial chromosomes may be directly introduced into chicken recipient cell lines through, for example, lipid-mediated carrier systems, such as lipofection procedures [see, e.g., Brazolot et al. (1991) Mol. Reprod. Dev. 30:304-312] or direct microinjection. Microinjection is generally preferred for introduction of the artificial chromosomes into chicken zygotes [see, e.g., Love et al. (1994) Biotechnology 12:60-63].

[0768] c. Development of transgenic chickens

[0769] Transgenic chickens may be developed by injecting recipient Stage X blastoderm cells (which have received the artificial chromosomes) into embryos at a similar stage of development [see, e.g., Etches et al. (1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189; and Carsience et al. (1993) Development 117:669-675]. The recipient chicken embryos within the shell are candled and allowed to hatch to yield a germline chimeric chicken that will express the transgene(s) in some of its cells.

[0770] Alternatively, the artificial chromosomes may be introduced into chick zygotes, for example through direct microinjection [see, e.g., Love et al. (1994) Biotechnology 12:60-63], and thereby are incorporated into at least a portion of the cells in the chicken. Inclusion of a tissue-specific promoter, such as an egg-specific promoter, will ensure appropriate expression of operatively-linked heterologous DNA.

[0771] The DNA of interest may also be introduced into a minichromosome, by methods provided herein. The minichromosome may either be one provided herein, or one generated in chicken cells using the methods herein. The heterologous DNA will be introduced using a targeting vector, such as those provided herein, or constructed as provided herein.

[0772] Since modifications will be apparent to those of skill in this art, it is intended that this invention be limited only by the scope of the appended claims.

0 SEQUENCE LISTING (1) GENERAL INFORMATION: (iii) NUMBER OF SEQUENCES:34 (2) INFORMATION FOR SEQ ID NO: 1: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 1293 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:GAATTCATCA TTTTTCANGT CCTCAAGTGG ATGTTTCTCA TTTNCCATGA TTTTAAGTTT 60TCTCGCCATA TTCCTGGTCC TACAGTGTGC ATTTCTCCAT TTTNCACGTT TTNCAGTGAT 120TTCGTCATTT TCAAGTCCTC AAGTGGATGT TTCTCATTTN CCATGAATTT CAGTTTTCTN 180GCCATATTCC ACGTCCTACA GNGGACATTT CTAAATTTNC CACCTTTTTC AGTTTTCCTC 240GCCATATTTC ACGTCCTAAA ATGTGTATTT CTCGTTTNCC GTGATTTTCA GTTTTCTCGC 300CAGATTCCAG GTCCTATAAT GTGCATTTCT CATTTNNCAC GTTTTTCAGT GATTTCGTCA 360TTTTTTCAAG TCGGCAAGTG GATGTTTCTC ATTTNCCATG ATTTNCAGTT TTCTTGNAAT 420ATTCCATGTC CTACAATGAT CATTTTTAAT TTTCCACCTT TTCATTTTTC CACGCCATAT 480TTCATGTCCT AAAGTGTATA TTTCTCCTTT TCCGCGATTT TCAGTTTTCT CGCCATATTC 540CAGGTCCTAC AGTGTGCATT CCTCATTTTT CACCTTTTTC ACTGATTTCG TCATTTTTCA 600AGTCGTCAAC TGGATCTTTC TAATTTTCCA TGATTTTCAG TTATCTTGTC ATATTCCATG 660TCCTACAGTG GACATTTCTA AATTTTCCAA CTTTTTCAAT TTTTCTCGAC ATATTTGACG 720TGCTAAAGTG TGTATTTCTT ATTTTCCGTG ATTTTCAGTT TTCTCGCCAT ATTCCAGGTC 780CTAATAGTGT GCATTTCTCA TTTTTCACGT TTTTCAGTGA TTTCGTCATT TTTTCCAGTT 840GTCAAGGGGA TGTTTCTCAT TTTCCATGAG TGTCAGTTTT CTTGCTATAT TCCATGTCCT 900ACAGTGACAT TTCTAAATAT TATACCTTTT TCAGTTTTTC TCACCATATT TCACGTCCTA 960AAGTATATAT TTCTCATTTT CCCTGATTTT CAGTTTCCTT GCCATATTCC AGGTCCTACA 1020GTGTGCATTT CTCATTTTTC ACGTTTTTCA GTAATTTCTT CATTTTTTAA GCCCTCAAAT 1080GGATGTTTCT CATTTTCCAT GATTTTCAGT TTTCTTGCCA TATACCATGT CCTACAGTGG 1140ACATTTCTAA ATTATCCACC TTTTTCAGTT TTTCATCGGC ACATTTCACG TCCTAAAGTG 1200TGTATTTCTA ATTTTCAGTG ATTTTCAGTT TTCTCGCCAT ATTCCAGGAC CTACAGTGTG 1260CATTTCTCAT TTTTCACGTT TTTCAGTGAA TTC 1293 (2) INFORMATION FOR SEQ ID NO:2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1044 base pairs (B) 
TYPE:nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULETYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v)FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi)SEQUENCE DESCRIPTION: SEQ ID NO: 2: AGGCCTATGG TGAAAAAGGA AATATCTTCCCCTGAAAACT AGACAGAAGG ATTCTCAGAA 60 TCTTATTTGT GATGTGCGCC CCTCAACTAACAGTGTTGAA GCTTTCTTTT GATAGAGCAG 120 TTTTGAAACA CTCTTTTTGT AAAATCTGCAAGAGGATATT TGGATAGCTT TGAGGATTTC 180 CGTTGGAAAC GGGATTGTCT TCATATAAACCCTAGACAGA AGCATTCTCA GAAGCTTCAT 240 TGGGATGTTT CAGTTGAAGT CACAGTGTTGAACAGTCCCC TTTCATAGAG CAGGTTTGAA 300 ACACTCTTTT TTGTAGTATC TGGAAGTGGACATTTGGAGC GATCTCAGGA CTGCGGTGAA 360 AAAGGAAATA TCTTCCAATA AAAGCTAGATAGAGGCAATG TCAGAAACCT TTTTCATGAT 420 GTATCTACTC AGCTAACAGA GTTGAACCTTCCTTTGAGAG AGCAGTTTTG AAACACTCTT 480 TTTGTGGAAT CTGCAAGTGG ATATTTGTCTAGCTTTGAGG ATTTCGTTGG GAAACGGGAT 540 TACATATAAA AAGCAGACAG CAGCATTCCCAGAAACTTCT TTGTGATGTT TGCATTCAAG 600 TCACAGAGTT GAACATTCCC TTTCATAGAGCAGGTTTGAA ACACACTTTT TGATGTATCT 660 GGATGTGGAC ATTTGCAGCG CTTTCAGGCCTAAGGTGAAA AGGAAATATC TTCCCCTGAA 720 AACTAGACAG AAGCATTCTC AGAAACTTATTTGTGATGTG CGCCCTCAAC TAACAGTGTT 780 GAAGCTTTCT TTTGATAGAG GCAGTTTTGAAACACTCTTT TGTGGAATCT GCAAGTGGAT 840 ATTTGTCTAG CTTTGAGGAT TTCTTTGGAAACGGGATTAC ATATAAAAAG CAGACAGCAG 900 CATTCCCAGA ATCTTGTTTG TGATGTTTGCATTCAAGTCA CAGAGTTGAA CATTCCCTTT 960 CAGAGAGCAG GTTTGAACAC TCTTTTTATAGTATCTGGAT GTGGACATTT GGAGCGCTTT 1020 CAGGGGGGAT CCTCTAGAAT TCCT 1044(2) INFORMATION FOR SEQ ID NO: 3: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 2492 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:CTGCAGCTGG GGGTCTCCAA TCAGGCAGGG GCCCCTTACT ACTCAGATGG GGTGGCCGAG 60TAGGGGAAGG GGGTGCAGGC TGCATGAGTG GACACAGCTG TAGGACTACC TGGGGGCTGT 120GGATCTATGG GGGTGGGGAG AAGCCCAGTG ACAGTGCCTA GAAGAGACAA GGTGGCCTGA 180GAGGGTCTGA 
GGAACATAGA GCTGGCCATG TTGGGGCCAG GTCTCAAGCA GGAAGTGAGG 240AATGGGACAG GCTTGAGGAT ACTCTACTCA GTAGCCAGGA TAGCAAGGAG GGCTTGGGGT 300TGCTATCCTG GGGTTCAACC CCCCAGGTTG AAGGCCCTGG GGGAGATGGT CCCAGGACAT 360ATTACAATGG ACACAGGAGG TTGGGACACC TGGAGTCACC AAACAAAACC ATGCCAAGAG 420AGACCATGAG TAGGGGTGTC CAGTCCAGCC CTCTGACTGA GCTGCATTGT TCAAATCCAA 480AGGGCCCCTG CTGCCACCTA GTGGCTGATG GCATCCACAT GACCCTGGGC CACACGCGTT 540TAGGGTCTCT GTGAAGACCA AGATCCTTGT TACATTGAAC GACTCCTAAA TGAGCAGAGA 600TTTCCACCTA TTCGAAACAA TCACATAAAA TCCATCCTGG AAAAAGCCTG GGGGATGGCA 660CTAAGGCTAG GGATAGGGTG GGATGAAGAT TATAGTTACA GTAAGGGGTT TAGGGTTAGG 720GATCAACGTT GGTTAGGAGT TAGGGATACA GTAGGGTACC GGTAGGGTTA GGGGTTAGGG 780TTAGGGGTTA GGGTTAGGGT TAGGGTTAGG GTTAGGGTTA GGGGTTAGGG GTTAGGGTTA 840GGGTTAGGTT TTGGGGTGGC GTATTTTGGT CTTATACGCT GTGTTCCACT GGCAATGAAA 900AGAGTTCTTG TTTTTCCTTC AGCAATTTGT CATTTTTAAA AGAGTTTAGC AATTCTAACA 960GATATAGACC AGCTGTGCTA TCTCATTGTG GTTTTCAATT GTAACCACAT TGTGGTTTCA 1020ATGTGTTTAC TTGCCATCTG TAGATCTTCT TTGCGTGAGG TGTCTGTTCA GATGTGTGTG 1080CATTTCTTGN NTTTNGGCTG TTTAACTTAT TGTTTAGTTT TAATAATTTT TTATATATTT 1140GAAGACAAAT CTTTCTCAGA TGTGTATTTG CAAATATTTC TTCAATATGA GGCTTGCTTT 1200TGTCTCTAAC AAGGTCTCTT CAGAGATAAC TTAAATATAA GAAATCCACA CTGTCACTTC 1260TTTTGTGTAT ATCTACCTTT TGTGTCATTT GTTAAAATTC ATTACCAAAC CCAAAGGCAG 1320ATAGCTTTTC TTCTATTGTT TCTTCTAGAA ATTTGTATAG TTTTGCATTT TTAGTGTAAG 1380GATGATTTTG AGTGATTATT TGTGTAAGTT GTAAAGTTTT CGTCTATATC CATATCATTT 1440CTTATGGTTT CCAATTAATC GTTCCCTCAC TATTTTTGGG AAAGACACAG GATAGTGGGC 1500TTTGTTAGAG TAGATAGGTA GCTAGACATG AACAGGAGGG GGCCTCCTGG AAAAGGGAAA 1560GTCTGGGAAG GCTCACCTGG AGGACCACCA AAAATTCACA TATTAGTAGC ATCTCTAGTG 1620CTGGAGTGGA TGGGCACTTG TCAATTGTGG GTAGGAGGGA AAAGAGGTCC TATGCAGAAA 1680GAAACTCCCT AGAACTCCTC TGAAGATGCC CCAATCATTC ACTCTGCAAT AAAAATGTCA 1740GAATATTGCT AGCTACATGC TGATAAGGNN AAAGGGGACA TTCTTAAGTG AAACCTGGCA 1800CCATAAGTAC AGATTAGGGC AGAGAAGGAC ATTCAAAAGA GGCAGGCGCA GTAGGTACAA 1860ACGTGATCGC TGTCAGTGTG CCTGGGATGG CGGGAAGGAG GCTGGTGCCA 
GAGTGGATTC 1920GTATTGATCA CCACACATAT ACCTCAACCA ACAGTGAGGA GGTCCCACAA GCCTAAGTGG 1980GGCAAGTTGG GGAGCTAAGG CAGTAGCAGG AAAACCAGAC AAAGAAAACA GGTGGAGACT 2040TGAGACAGAG GCAGGAATGT GAAGAAATCC AAAATAAAAT TCCCTGCACA GGACTCTTAG 2100GCTGTTTAAT GCATCGCTCA GTCCCACTCC TCCCTATTTT TCTACAATAA ACTCTTTACA 2160CTGTGTTTCT TTTCAATGAA GTTATCTGCC ATCTTTGTAT TGCCTCTTGG TGAAAATGTT 2220TCTTCCAAGT TAAACAAGAA CTGGGACATC AGCTCTCCCC AGTAATAGCT CCGTTTCAGT 2280TTGAATTTAC AGAACTGATG GGCTTAATAA CTGGCGCTCT GACTTTAGTG GTGCAGGAGG 2340CCGTCACACC GGGACCAAGA GTGCCCTGCC TAGTCCCCAT CTGCCCGCAG GTGGCGGCTG 2400CCTCGACACT GACAGCAATA GGGTCCGGCA GTGTCCCCAG CTGCCAGCAG GGGGCGTACG 2460ACGACTACAC TGTGAGCAAG AGGGCCCTGC AG 2492 (2) INFORMATION FOR SEQ ID NO:4: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 28 base pairs (B) TYPE:nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULETYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v)FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi)SEQUENCE DESCRIPTION: SEQ ID NO: 4: GGGGAATTCA TTGGGATGTT TCAGTTGA 28(2) INFORMATION FOR SEQ ID NO: 5: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 29 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:CGAAAGTCCC CCCTAGGAGA TCTTAAGGA 29 (2) INFORMATION FOR SEQ ID NO: 6: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 47 base pairs (B) TYPE: nucleicacid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE:DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION:SEQ ID NO: 6: CGCTTAATA CTCTGATGAG TCCGTGAGGA CGAAACGCTC TCGCACC 47 (2)INFORMATION FOR SEQ ID NO: 7: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH:25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D)TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) 
HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: GATTTAAATTAATTAAGCC CGGGC 25 (2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 27 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: AAATTTAAT TAATTCGGGC CCGTCGA 27 (2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 69 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: ATG TAC AGG ATG CAA CTC CTG TCT TGC ATT GCA CTA AGT CTT GCA CTT 48 Met Tyr Arg Met Gln Leu Leu Ser Cys Ile Ala Leu Ser Leu Ala Leu GTC ACA AAC AGT GCA CCT ACT 69 Val Thr Asn Ser Ala Pro Thr (2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 945 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vi) ORIGINAL SOURCE: (ix) FEATURE: (A) NAME/KEY: Coding Sequence (B) LOCATION: 1...942 (D) OTHER INFORMATION: Renilla reniformis Luciferase (x) PUBLICATION INFORMATION: (H) DOCUMENT NUMBER: 5,418,155 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: AGC TTA AAG ATG ACT TCG AAA GTT TAT GAT CCA GAA CAA AGG AAA CGG 48 Ser Leu Lys Met Thr Ser Lys Val Tyr Asp Pro Glu Gln Arg Lys Arg 1 5 10 15 ATG ATA ACT GGT CCG CAG TGG TGG GCC AGA TGT AAA CAA ATG AAT GTT 96 Met Ile Thr Gly Pro Gln Trp Trp Ala Arg Cys Lys Gln Met Asn Val 20 25 30 CTT GAT TCA TTT ATT AAT TAT TAT GAT TCA GAA AAA CAT GCA GAA AAT 144 Leu Asp Ser Phe Ile Asn Tyr Tyr Asp Ser Glu Lys His Ala Glu Asn 35 40 45 GCT GTT ATT TTT TTA CAT GGT AAC GCG GCC TCT TCT TAT TTA TGG CGA 192 Ala Val Ile Phe Leu His Gly Asn Ala Ala Ser Ser Tyr Leu Trp Arg 50 55 60 CAT GTT GTG CCA CAT ATT GAG CCA GTA GCG CGG TGT ATT ATA CCA GAT 240
His Val Val Pro His Ile Glu Pro Val Ala Arg Cys Ile Ile Pro Asp 65 70 75 80 CTT ATT GGT ATG GGC AAA TCA GGC AAA TCT GGT AAT GGT TCT TAT AGG 288 Leu Ile Gly Met Gly Lys Ser Gly Lys Ser Gly Asn Gly Ser Tyr Arg 85 90 95 TTA CTT GAT CAT TAC AAA TAT CTT ACT GCA TGG TTG AAC TTC TTA ATT 336 Leu Leu Asp His Tyr Lys Tyr Leu Thr Ala Trp Leu Asn Phe Leu Ile 100 105 110 TAC CAA AGA AGA TCA TTT TTT GTC GGC CAT GAT TGG GGT GCT TGT TTG 384 Tyr Gln Arg Arg Ser Phe Phe Val Gly His Asp Trp Gly Ala Cys Leu 115 120 125 GCA TTT CAT TAT AGC TAT GAG CAT CAA GAT AAG ATC AAA GCA ATA GTT 432 Ala Phe His Tyr Ser Tyr Glu His Gln Asp Lys Ile Lys Ala Ile Val 130 135 140 CAC GCT GAA AGT GTA GTA GAT GTG ATT GAA TCA TGG GAT GAA TGG CCT 480 His Ala Glu Ser Val Val Asp Val Ile Glu Ser Trp Asp Glu Trp Pro 145 150 155 160 GAT ATT GAA GAA GAT ATT GCG TTG ATC AAA TCT GAA GAA GGA GAA AAA 528 Asp Ile Glu Glu Asp Ile Ala Leu Ile Lys Ser Glu Glu Gly Glu Lys 165 170 175 ATG GTT TTG GAG AAT AAC TTC TTC GTG GAA ACC ATG TTG CCA TCA AAA 576 Met Val Leu Glu Asn Asn Phe Phe Val Glu Thr Met Leu Pro Ser Lys 180 185 190 ATC ATG AGA AAG TTA GAA CCA GAA GAA TTT GCA GCA TAT CTT GAA CCA 624 Ile Met Arg Lys Leu Glu Pro Glu Glu Phe Ala Ala Tyr Leu Glu Pro 195 200 205 TTC AAA GAG AAA GGT GAA GTT CGT CGT CCA ACA TTA TCA TGG CCT CGT 672 Phe Lys Glu Lys Gly Glu Val Arg Arg Pro Thr Leu Ser Trp Pro Arg 210 215 220 GAA ATC CCG TTA GTA AAA GGT GGT AAA CCT GAC GTT GTA CAA ATT GTT 720 Glu Ile Pro Leu Val Lys Gly Gly Lys Pro Asp Val Val Gln Ile Val 225 230 235 240 AGG AAT TAT AAT GCT TAT CTA CGT GCA AGT GAT GAT TTA CCA AAA ATG 768 Arg Asn Tyr Asn Ala Tyr Leu Arg Ala Ser Asp Asp Leu Pro Lys Met 245 250 255 TTT ATT GAA TCG GAT CCA GGA TTC TTT TCC AAT GCT ATT GTT GAA GGC 816 Phe Ile Glu Ser Asp Pro Gly Phe Phe Ser Asn Ala Ile Val Glu Gly 260 265 270 GCC AAG AAG TTT CCT AAT ACT GAA TTT GTC AAA GTA AAA GGT CTT CAT 864 Ala Lys Lys Phe Pro Asn Thr Glu Phe Val Lys Val Lys Gly Leu His 275 280 285 TTT TCG CAA GAA GAT GCA CCT GAT GAA ATG GGA AAA TAT ATC AAA TCG 912 Phe
Ser Gln Glu Asp Ala Pro Asp Glu Met Gly Lys Tyr Ile Lys Ser 290 295 300 TTC GTT GAG CGA GTT CTC AAA AAT GAA CAA TAA 945 Phe Val Glu Arg Val Leu Lys Asn Glu Gln 305 310 (2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: TTTGAATTCA TGTACAGGAT GCAACTCCTG 30 (2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: TTTGAATTCA GTAGGTGCAC TGTTTGTCAC 30 (2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1434 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: CCTCCACGCA CGTTGTGATA TGTAGATGAT AATCATTATC AGAGCAGCGT TGGGGGATAA 60 TGTCGACATT TCCACTCCCA ATGACGGTGA TGTATAATGC TCAAGTATTC TCCTGCTTTT 120 TTACCACTAA CTAGGAACTG GGTTTGGCCT TAATTCAGAC AGCCTTGGCT CTGTCTGGAC 180 AGGTCCAGAC GACTGACACC ATTAACACTT TGTCAGCCTC AGTGACTACA GTCATAGATG 240 AACAGGCCTC AGCTAATGTC AAGATACAGA GAGGTCTCAT GCTGGTTAAT CAACTCATAG 300 ATCTTGTCCA GATACAACTA GATGTATTAT GACAAATAAC TCAGCAGGGA TGTGAACAAA 360 AGTTTCCGGG ATTGTGTGTT ATTTCCATTC AGTATGTTAA ATTTACTAGG ACAGCTAATT 420 TGTCAAAAAG TCTTTTTCAG TATATGTTAC AGAATTGGAT GGCTGAATTT GAACAGATCC 480 TTCGGGAATT GAGACTTCAG GTCAACTCCA CGCGCTTGGA CCTGTCGCTG ACCAAAGGAT 540 TACCCAATTG GATCTCCTCA GCATTTTCTT TCTTTAAAAA ATGGGTGGGA TTAATATTAT 600 TTGGAGATAC ACTTTGCTGT GGATTAGTGT TGCTTCTTTG ATTGGTCTGT AAGCTTAAGG 660
CCCAAACTAG GAGAGACAAG GTGGTTATTG CCCAGGCGCTTGCAGGACTA GAACATGGAG 720 CTTCCCCTGA TATATGGTTA TCTATGCTTA GGCAATAGGTCGCTGGCCAC TCAGCTCTTA 780 TATCCCACGA GGCTAGTCTC ATTGTACGGG ATAGAGTGAGTGTGCTTCAG CAGCCCGAGA 840 GAGTTGCAAG GCTAAGCACT GCAATGGAAA GGCTCTGCGGCATATATGTG CCTATTCTAG 900 GGGGACATGT CATCTTTCAT GAAGGTTCAG TGTCCTAGTTCCCTTCCCCC AGGCAAAACG 960 ACACGGGAGC AGGTCAGGGT TGCTCTGGGT AAAAGCCTGTGAGCCTGGGA GCTAATCCTG 1020 TACATGGCTC CTTTACCTAC ACACTGGGGA TTTGACCTCTATCTCCACTC TCATTAATAT 1080 GGGTGGCCTA TTTGCTCTTA TTAAAAGGAA AGGGGGAGATGTTGGGAGCC GCGCCCACAT 1140 TCGCCGTTAC AAGATGGCGC TGACAGCTGT GTTCTAAGTGGTAAACAAAT AATCTGCGCA 1200 TGTGCCGAGG GTGGTTCTTC ACTCCATGTG CTCTGCCTTCCCCGTGACGT CAACTCGGCC 1260 GATGGGCTGC AGCCAATCAG GGAGTGACAC GTCCTAGGCGAAGGAGAATT CTCCTTAATA 1320 GGGACGGGGT TTCGTTCTCT CTCTCTCTCT TGCTTCTCTCTCTTGCTTTT TCGCTCTCTT 1380 GCTTCCCGTA AAGTGATAAT GATTATCATC TACATATCACAACGTGCGTG GAGG 1434 (2) INFORMATION FOR SEQ ID NO: 14: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 1400 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: CCTCCACGCA CGTTGTGATA TGTAGATGAT AATCATTATC AGAGCAGCGT TGGGGGATAA 60TGTCGACATT TCCACTCCCA ATGACGGTGA TGTATAATGC TCAAGTATTC TCCTGCTTTT 120TTACCACTAA CTAGGAACTG GGTTTGGCCT TAATTCAGAC AGCCTTGGCT CTGTCTGGAC 180AGGTCCAGAT ACAACTAGAT GTATTATGAC AAATAACTCA GCAGGGATGT GAACAAAAGT 240TTCCGGGATT GCGTGTTATT TCCATCCAGT ATGTTAAATT TACTAGGGCA GCTAATTTGT 300CAAAAAGTCT TTTCCAGTAT ATGTTACAGA ATTGGATGGC TGAATTTGAA CAGATCCTTC 360GGGAATTGAG ACTTCAGGTC AACTCCACGC GCTTGGACCT GTCCCTGACC AAAGGATTAC 420CCAATTGGAT CTCCTCAGCA TTTTCTTTCT TTAAAAAATG GGTGGGATTA ATATTATTTG 480GAGATACACT TTGCTGTGGA TTAGTGTTGC TTCTTTGATT GGTCTGTAAG CTTAAGGCCC 540AAACTAGGAG AGACAAGGTG GTTATTGCCC AGGCGCTTGC AGGACTAGAA CATGGAGCTT 600CCCCTGATAT ATCTATGCTT AGGCAATAGG TCGCTGGCCA CTCAGCTCTT ATATCCCATG 660AGGCTAGTCT 
CATTGCACGG GATAGAGTGA GTGTGCTTCA GCAGCCCGAG AGAGTTGCAC 720GGCTAAGCAC TGCAATGGAA AGGCTCTGCG GCATATATGA GCCTATTCTA GGGAGACATG 780TCATCTTTCA AGAAGGTTGA GTGTCCAAGT GTCCTTCCTC CAGGCAAAAC GACACGGGAG 840CAGGTCAGGG TTGCTCTGGG TAAAAGCCTG TGAGCCTAAG AGCTAATCCT GTACATGGCT 900CCTTTACCTA CACACTGGGG ATTTGACCTC TATCTCCACT CTCATTAATA TGGGTGGCCT 960ATTTGCTCTT ATTAAAAGGA AAGGGGGAGA TGTTGGGAGC CGCGCCCACA TTCGCCGTTA 1020CAAGATGGCG CTGACAGCTG TGTTCTAAGT GGTAAACAAA TAATCTGCGC ATGCGCCGAG 1080GGTGGTTCTT CACTCCATGT GCTCTGCCTT CCCCGTGACG TCAACTCGGC CGATGGGCTG 1140CAGTCAATCA GGGAGTGACA CGTCCTAGGC GAAGGAAAAT TCTCCTTAAT AGGGACGGGG 1200TTTCGTTTTC TCTCTCTCTT GCTTCGCTCT CTCTTGCTTC TTGCTCTCTT TTCCTGAAGA 1260TGTAAGAATA AAGCTTTGCC GCAGAAGATT CTGGTCTGTG GTGTTCTTCC TGGCCGGTCG 1320TGAGAACGCG TCTAATAACA ATTGGTGCCG AAACCCGGGT GATAATGATT ATCATCTACA 1380TATCACAACG TGCGTGGAGG 1400 (2) INFORMATION FOR SEQ ID NO: 15: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 1369 base pairs (B) TYPE: nucleicacid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE:Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENTTYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ IDNO: 15: CCTCCACGCA CGTTGTGATA TGTAGATGAT AATCATTATC ACTTTACGGGTCCTTTCACT 60 ACAACTGCCA CGAGGCCCCG TGCTCTGGTA ATAGATCTTT GCTGAAAAGGCACACACATG 120 ACACATTACT CAAGGTGGGC TCATCTGAGC TGCAGATTCA GCTTAATATGAATCTTGCCA 180 ATTGTGTGAA ATCATAAATC TTCAAAGTGA CACTCATTGC CAGACACAGGTGCCCACCTT 240 TGGCATAATA AACAAACACA AATTATCTAT TATATAAAGG GTGTTAGAAGATGCTTTAGA 300 ATACAAATAA ATCATGGTAG ATAACAGTAA GTTGAGAGCT TAAATTTAATAAAGTGATAT 360 ACCTAATAAA AATTAAATTA AGAAGGTGTG AATATACTAC AGTAGGTAAATTATTTCATT 420 AATTTATTTT CTTTCTTAAT CCTTTATAAT GTTTTCTGCT ATTGTCAATTGCACATCCAT 480 ATGTTCAATT CTTCACTGTA ATGAAGAAAT GTAGTAAATA TACTTTCCGAACAAGTTGTA 540 TCAAATATGT TACACTTGAT TCCGTGTGTT ACTTATCATT TTATTATTATATTGATTGCA 600 TTCCTTCGTT ACTTGATATT ATTACAAGGT ACATATTTAT TCTCTCAGATCTTCATTATA 660 CTCTAACCAT TTTATAACAT ACTTTATTTA TTCATTTCTT ATGTGTGCTGTGAGGCACAA 
720 ATGCCAGAGA GAACTTGAGC AGATAAGAGG ACAAATTGCA AGAGTCAGTTACCTCCTGCT 780 GTTCCTTGGA AACTCAGGAT CAAATTCAGG TTGTCAGGCT TGGCAGCATGCACTTTTTAC 840 CAGTGCCTCC ATCTTGCTAG CCCTGAACAT CAAGCTTTGC AGACAGACAGGCTACACTAA 900 GTGAACTGGT CATTCACAGC ATGCATGGTG ATTTATTGTT ACTTTCTATTCCATGCCTTT 960 ACTATTTCTA CTAGGTGCTA GCTAGTACTG TATTTCGAGA TAGAAGTTACTGAAAGAAAA 1020 TTACATTGTT TTCTATAGAT CCTTGATACT CTTTCAGCAG ATATAGAGTTTTAATCAGGT 1080 CCTAGACCCT TTCTTCACTC TTATTAAATA CTAAGTACAA ATTAAGTTTATCCAAAACAG 1140 TACGGATGTT GATTTTGTGC AGTTCTACTA TGATAATAGT CTAGCTTCATAAATCTGACA 1200 CACTTATTGG GAATGTTTTT GTTAATAAAA GATTCAGGTG TTACTCTAGGTCAAGAGAAT 1260 ATTAAACATC AGTCCCAAAT TACAAACTTC AATAAAAGAT TTGACTCTCCAGTGGTGGCA 1320 ATATAAAGTG ATAATGATTA TCATCTACAT ATCACAACGT GCGTGGAGG1369 (2) INFORMATION FOR SEQ ID NO: 16: (i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 22118 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS:single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii)HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi)ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: GAATTCCCCTATCCCTAATC CAGATTGGTG GAATAACTTG GTATAGATGT TTGTGCATTA 60 AAAACCCTGTAGGATCTTCA CTCTAGGTCA CTGTTCAGCA CTGGAACCTG AATTGTGGCC 120 CTGAGTGATAGGTCCTGGGA CATATGCAGT TCTGCACAGA CAGACAGACA GACAGACAGA 180 CAGACAGACAGACAGACGTT ACAAACAAAC ACGTTGAGCC GTGTGCCAAC ACACACACAA 240 ACACCACTCTGGCCATAATT ATTGAGGACG TTGATTTATT ATTCTGTGTT TGTGAGTCTG 300 TCTGTCTGTCTGTCTGTCTG TCTGTCTGTC TATCAAACCA AAAGAAACCA AACAATTATG 360 CCTGCCTGCCTGCCTGCCTG CCTACACAGA GAAATGATTT CTTCAATCAA TCTAAAACGA 420 CCTCCTAAGTTTGCCTTTTT TCTCTTTCTT TATCTTTTTC TTTTTTCTTT TCTTCTTCCT 480 TCCTTCCTTCCTTCCTTCCT TCCTTCCTTT CTTTCTTTCT TTCTTTCTTT CTTACTTTCT 540 TTCTTTCCTTCTTACATTTA TTCTTTTCAT ACATAGTTTC TTAGTGTAAG CATCCCTGAC 600 TGTCTTGAAGACACTTTGTA GGCCTCAATC CTGTAAGAGC CTTCCTCTGC TTTTCAAATG 660 CTGGCATGAATGTTGTACCT CACTATGACC AGCTTAGTCT TCAAGTCTGA GTTACTGGAA 720 AGGAGTTCCAAGAAGACTGG TTATATTTTT CATTTATTAT TGCATTTTAA TTAAAATTTA 780 
ATTTCACCAAAAGAATTTAG ACTGACCAAT TCAGAGTCTG CCGTTTAAAA GCATAAGGAA 840 AAAGTAGGAGAAAAACGTGA GGCTGTCTGT GGATGGTCGA GGCTGCTTTA GGGAGCCTCG 900 TCACCATTCTGCACTTGCAA ACCGGGCCAC TAGAACCCGG TGAAGGGAGA AACCAAAGCG 960 ACCTGGAAACAATAGGTCAC ATGAAGGCCA GCCACCTCCA TCTTGTTGTG CGGGAGTTCA 1020 GTTAGCAGACAAGATGGCTG CCATGCACAT GTTGTCTTTC AGCTTGGTGA GGTCAAAGTA 1080 CAACCGAGTCACAGAACAAG GAAGTATACA CAGTGAGTTC CAGGTCAGCC AGAGTTTACA 1140 CAGAGAAACCACATCTTGAA AAAAACAAAA AAATAAATTA AATAAATATA ATTTAAAAAT 1200 TTAAAAATAGCCGGGAGTGA TGGCGCATGT CTTTAATCCC AGCTCTCTTC AGGCAGAGAT 1260 GGGAGGATTTCTGAGTTTGA GGCCAGCCTG GTCTGCAAAG TGAGTTCCAG GACAGTCAGG 1320 GCTATACAGAGAAACCCTGT CTTGAAAACT AAACTAAATT AAACTAAACT AAACTAAAAA 1380 AATATAAAATAAAAATTTTA AAGAATTTTA AAAAACTACA GAAATCAAAC ATAAGCCCAC 1440 GAGATGGCAAGTAACTGCAA TCATAGCAGA AATATTATAC ACACACACAC ACACAGACTC 1500 TGTCATAAAATCCAATGTGC CTTCATGATG ATCAAATTTC GATAGTCAGT AATACTAGAA 1560 GAATCATATGTCTGAAAATA AAAGCCAGAA CCTTTTCTGC TTTTGTTTTC TTTTGCCCCA 1620 AGATAGGGTTTCTCTCAGTG TATCCCTGGC ATCCCTGCCT GGAACTTCCT TTGTAGGTTT 1680 GGTAGCCTCAAACTCAGAGA GGTCCTCTCT GCCTGCCTGC CTGCCTGCCT GCCTGCCTGC 1740 CTGCCTGCCTGCCTGCCTCA CTTCTTCTGC CACCCACACA ACCGAGTCGA ACCTAGGATC 1800 TTTATTTCTTTCTCTTTCTC TCTTCTTTCT TTCTTTCTTT CTTTCTTTCT TTCTTTCTTT 1860 CTTTCTTTCTTTCTTATTCA ATTAGTTTTC AATGTAAGTG TGTGTTTGTG CTCTATCTGC 1920 TGCCTATAGGCCTGCTTGCC AGGAGAGGGC AACAGAACCT AGGAGAAACC ACCATGCAGC 1980 TCCTGAGAATAAGTGAAAAA ACAACAAAAA AAGGAAATTC TAATCACATA GAATGTAGAT 2040 ATATGCCGAGGCTGTCAGAG TGCTTTTTAA GGCTTAGTGT AAGTAATGAA AATTGTTGTG 2100 TGTCTTTTATCCAAACACAG AAGAGAGGTG GCTCGGCCTG CATGTCTGTT GTCTGCATGT 2160 AGACCAGGCTGGCCTTGAAC ACATTAATCT GTCTGCCTCT GCTTCCCTAA TGCTGCGATT 2220 AAAGGCATGTGCCACCACTG CCCGGACTGA TTTCTTCTTT TTTTTTTTTT TGGAAAATAC 2280 CTTTCTTTCTTTTTCTCTCT CTCTTTCTTC CTTCCTTCCT TTCTTTCTAT TCTTTTTTTC 2340 TTTCTTTTTTCTTTTTTTTT TTTTTTTTAA AATTTGCCTA AGGTTAAAGG TGTGCTCCAC 2400 AATTGCCTCAGCTCTGCTCT AATTCTCTTT AAAAAAAAAC AAACAAAAAA AAAACCAAAA 2460 CAGTATGTATGTATGTATAT TTAGAAGAAA TACTAATCCA 
TTAATAACTC TTTTTTCCTA 2520 AAATTCATGTCATTCTTGTT CCACAAAGTG AGTTCCAGGA CTTACCAGAG AAACCCTGTG 2580 TTCAAATTTCTGTGTTCAAG GTCACCCTGG CTTACAAAGT GAGTTCCAAG TCCGATAGGG 2640 CTACACAGAAAAACCATATC TCAGAAAAAA AAAAAGTTCC AAACACACAC ACACACACAC 2700 ACACACACACACACACACAC ACACACACAC ACACACACAG CGCGCCGCGG CGATGAGGGG 2760 AAGTCGTGCCTAAAATAAAT ATTTTTCTGG CCAAAGTGAA AGCAAATCAC TATGAAGAGG 2820 TACTCCTAGAAAAAATAAAT ACAAACGGGC TTTTTAATCA TTCCAGCACT GTTTTAATTT 2880 AACTCTGAATTTAGTCTTGG AAAAGGGGGC GGGTGTGGGT GAGTGAGGGC GAGCGAGCAG 2940 ACGGGCGGGCGGGCGGGTGA GTGGCCGGCG GCGGTGGCAG CGAGCACCAG AAAACAACAA 3000 ACCCCAAGCGGTAGAGTGTT TTAAAAATGA GACCTAAATG TGGTGGAACG GAGGTCGCCG 3060 CCACCCTCCTCTTCCACTGC TTAGATGCTC CCTTCCCCTT ACTGTGCTCC CTTCCCCTAA 3120 CTGTGCCTAACTGTGCCTGT TCCCTCACCC CGCTGATTCG CCAGCGACGT ACTTTGACTT 3180 CAAGAACGATTTTGCCTGTT TTCACCGCTC CCTGTCATAC TTTCGTTTTT GGGTGCCCGA 3240 GTCTAGCCCGTTCGCTATGT TCGGGCGGGA CGATGGGGAC CGTTTGTGCC ACTCGGGAGA 3300 AGTGGTGGGTGGGTACGCTG CTCCGTCGTG CGTGCGTGAG TGCCGGAACC TGAGCTCGGG 3360 AGACCCTCCGGAGAGACAGA ATGAGTGAGT GAATGTGGCG GCGCGTGACG GATCTGTATT 3420 GGTTTGTATGGTTGATCGAG ACCATTGTCG GGCGACACCT AGTGGTGACA AGTTTCGGGA 3480 ACGCTCCAGGCCTCTCAGGT TGGTGACACA GGAGAGGGAA GTGCCTGTGG TGAGGCGACC 3540 AGGGTGACAGGAGGCCGGGC AAGCAGGCGG GAGCGTCTCG GAGATGGTGT CGTGTTTAGG 3600 GACGGTCTCTAACAAGGAGG TCGTACAGGG AGATGGCCAA AGCAGACCGA GTTGCTGTAC 3660 GCCCTTTTGGGAAAAATGCT AGGGTTGGTG GCAACGTTAC TAGGTCGACC AGAAGGCTTA 3720 AGTCCTACCCCCCCCCCCCT TTTTTTTTTT TTTCCTCCAG AAGCCCTCTC TTGTCCCCGT 3780 CACCGGGGGCACCGTACATC TGAGGCCGAG AGGACGCGAT GGGCCCGGCT TCCAAGCCGG 3840 TGTGGCTCGGCCAGCTGGCG CTTCGGGTCT TTTTTTTTTT TTTTTTTTTT TTTTCCTCCA 3900 GAAGCCTTGTCTGTCGCTGT CACCGGGGGC GCTGTACTTC TGAGGCCGAG AGGACGCGAT 3960 GGGCCCCGGCTTCCAAGCCG GTGTGGCTCG GCCAGCTGGA GCTTCGGGTC TTTTTTTTTT 4020 TTTTTTTTTTTTTTTTTCTC CAGAAGCCTT GTCTGTCGCT GTCACCGGGG GCGCTGTACT 4080 TCTGAGGCCGAGAGGACGCG ATGGGTCGGC TTCCAAGCCG ATGTGGCGGG GCCAGCTGGA 4140 GCTTCGGGTTTTTTTTTTTC CTCCAGAAGC CCTCTCTTGT CCCCGTCACC GGGGGCGCTG 4200 
TACTTCTGAGGCCGAGAGGA CGTGATGGGC CCGGGTTCCA GGCGGATGTC GCCCGGTCAG 4260 CTGGAGCTTTGGATCTTTTT TTTTTTTTTT CCTCCAGAAG CCCTCTCTTG TCCCCGTCAC 4320 CGGGGGCACCTTACATCTGA GGGCGAGAGG ACGTGATGGG TCCGGCTTCC AAGCCGATGT 4380 GGCGGGGCCAGCTGGAGCTT CGGGTTTTTT TTTTTTCCTC CAGAAGCCCT CTCTTGTCCC 4440 CGTCACCGGGGGCGCTGTAC TTCTGAGGCC GAGAGGACGT GATGGGCCCG GGTTCCAGGC 4500 GGATGTCGCCCGGTCAGCTG GAGCTTTGGA TCATTTTTTT TTTTCCCTCC AGAAGCCCTC 4560 TCTTGTCCCCGTCACCGGGG GCACCGTACA TCTGAGGCCG AGAGGACACG ATGGGCCTGT 4620 CTTCCAAGCCGATGTGGCCC GGCCAGCTGG AGCTTCGGGT CTTTTTTTTT TTTTTTCCTC 4680 CAGAAGCCTTGTCTGTCGCT GTCACCCGGG GCGCTGTACT TCTGAGGCCG AGAGGACGCG 4740 ATGGGCCCGGCTTCCAAGCC GGTGTGGCTC GGCCAGCTGG AGCTTCGGGT CTTTTTTTTT 4800 TTTTTTTTTTTTCCTCCAGA AACCTTGTCT GTCGCTGTCA CCCGGGGCGC TTGTACTTCT 4860 GATGCCGAGAGGACGCGATG GGCCCGTCTT CCAGGCCGAT GTGGCCCGGT CAGCTGGAGC 4920 TTTGGATCTTTTTTTTTTTT TTTTCCTCCA GAAGCCCTCT CTTGTCCCCG TCACCGGGGG 4980 CACCTTACATCTGAGGCCTA GAGGACACGA TGGGCCCGGG TTCCAGGCCG ATGTGGCCCG 5040 GTCAGCTGGAGCTTTGGATC TTTTTTTTTT TTTTCTTCCA GAAGCCCTCT TGTCCCCGTC 5100 ACCGGTGGCACTGTACATCT GAGGCGGAGA GGACATTATG GGCCCGGCTT CCAATCCGAT 5160 GTGGCCCGGTCAGCTGGAGC TTTGGATCTT ATTTTTTTTT TAATTTTTTC TTCCAGAAGC 5220 CCTCTTGTCCCTGTCACCGG TGGCACGGTA CATCTGAGGC CGAGAGGACA TTATGGGCCC 5280 GGCTTCCAGGCCGATGTGGC CCGGTCAGCT GGAGCTTTGG ATCTTTTTTT TTTTTTTTCT 5340 TTTTTCCTCCAGAAGCCCTC TCTGTCCCTG TCACCGGGGG CCCTGTACGT CTGAGGCCGA 5400 GGGAAAGCTATGGGCGCGGT TTTCTTTCAT TGACCTGTCG GTCTTATCAG TTCTCCGGGT 5460 TGTCAGGGTCGACCAGTTGT TCCTTTGAGG TCCGGTTCTT TTCGTTATGG GGTCATTTTT 5520 GGGCCACCTCCCCAGGTATG ACTTCCAGGC GTCGTTGCTC GCCTGTCACT TTCCTCCCTG 5580 TCTCTTTTATGCTTGTGATC TTTTCTATCT GTTCCTATTG GACCTGGAGA TAGGTACTGA 5640 CACGCTGTCCTTTCCCTATT AACACTAAAG GACACTATAA AGAGACCCTT TCGATTTAAG 5700 GCTGTTTTGCTTGTCCAGCC TATTCTTTTT ACTGGCTTGG GTCTGTCGCG GTGCCTGAAG 5760 CTGTCCCCGAGCCACGCTTC CTGCTTTCCC GGGCTTGCTG CTTGCGTGTG CTTGCTGTGG 5820 GCAGCTTGTGACAACTGGGC GCTGTGACTT TGCTGCGTGT CAGACGTTTT TCCCGATTTC 5880 CCCGAGGTGTCGTTGTCACA CCTGTCCCGG 
TTGGAATGGT GGAGCCAGCT GTGGTTGAGG 5940 GCCACCTTATTTCGGCTCAC TTTTTTTTTT TTTTTTTCTC TTGGAGTCCC GAACCTCCGC 6000 TCTTTTCTCTTCCCGGTCTT TCTTCCACAT GCCTCCCGAG TGCATTTCTT TTTGTTTTTT 6060 TTCTTTTTTTTTTTTTTTTT TTGGGGAGGT GGAGAGTCCC GAGTACTTCA CTCCTGTCTG 6120 TGGTGTCCAAGTGTTCATGC CACGTGCCTC CCGAGTGCAC TTTTTTTTGT GGCAGTCGCT 6180 CGTTGTGTTCTCTTGTTCTG TGTCTGCCCG TATCAGTAAC TGTCTTGCCC CGCGTGTAAG 6240 ACATTCCTATCTCGCTTGTT TCTCCCGATT GCGCGTCGTT GCTCACTCTT AGATCGATGT 6300 GGTGCTCCGGAGTTCTCTTC GGGCCAGGGC CAAGCCGCGC CAGGCGAGGG ACGGACATTC 6360 ATGGCGAATGGCGGCCGCTC TTCTCGTTCT GCCAGCGGGC CCTCGTCTCT CCACCCCATC 6420 CGTCTGCCGGTGGTGTGTGG AAGGCAGGGG TGCGGCTCTC CGGCCCGACG CTGCCCCGCG 6480 CGCACTTTTCTCAGTGGTTC GCGTGGTCCT TGTGGATGTG TGAGGCGCCC GGTTGTGCCC 6540 TCACGTGTTTCACTTTGGTC GTGTCTCGCT TGACCATGTT CCCAGAGTCG GTGGATGTGG 6600 CCGGTGGCGTTGCATACCCT TCCCGTCTGG TGTGTGCACG CGCTGTTTCT TGTAAGCGTC 6660 GAGGTGCTCCTGGAGCGTTC CAGGTTTGTC TCCTAGGTGC CTGCTTCTGA GCTGGTGGTG 6720 GCGCTCCCCATTCCCTGGTG TGCCTCCGGT GCTCCGTCTG GCTGTGTGCC TTCCCGTTTG 6780 TGTCTGAGAAGCCCGTGAGA GGGGGGTCGA GGAGAGAAGG AGGGGCAAGA CCCCCCTTCT 6840 TCGTCGGGTGAGGCGCCCAC CCCGCGACTA GTACGCCTGT GCGTAGGGCT GGTGCTGAGC 6900 GGTCGCGGCTGGGGTTGGAA AGTTTCTCGA GAGACTCATT GCTTTCCCGT GGGGAGCTTT 6960 GAGAGGCCTGGCTTTCGGGG GGGACCGGTT GCAGGGTCTC CCCTGTCCGC GGATGCTCAG 7020 AATGCCCTTGGAAGAGAACC TTCCTGTTGC CGCAGACCCC CCCGCGCGGT CGCCCGCGTG 7080 TTGGTCTTCTGGTTTCCCTG TGTGCTCGTC GCATGCATCC TCTCTCGGTG GCCGGGGCTC 7140 GTCGGGGTTTTGGGTCCGTC CCGCCCTCAG TGAGAAAGTT TCCTTCTCTA GCTATCTTCC 7200 GGAAAGGGTGCGGGCTTCTT ACGGTCTCGA GGGGTCTCTC CCGAATGGTC CCCTGGAGGG 7260 CTCGCCCCCTGACCGCCTCC CGCGCGCGCA GCGTTTGCTC TCTCGTCTAC CGCGGCCCGC 7320 GGCCTCCCCGCTCCGAGTTC GGGGAGGGAT CACGCGGGGC AGAGCCTGTC TGTCGTCCTG 7380 CCGTTGCTGCGGAGCATGTG GCTCGGCTTG TGTGGTTGGT GGCTGGGGAG AGGGCTCCGT 7440 GCACACCCCCGCGTGCGCGT ACTTTCCTCC CCTCCTGAGG GCCGCCGTGC GGACGGGGTG 7500 TGGGTAGGCGACGGTGGGCT CCCGGGTCCC CACCCGTCTT CCCGTGCCTC ACCCGTGCCT 7560 TCCGTCGCGTGCGTCCCTCT CGCTCGCGTC CACGACTTTG GCCGCTCCCG CGACGGCGGC 7620 
CTGCGCCGCGCGTGGTGCGT GCTGTGTGCT TCTCGGGCTG TGTGGTTGTG TCGCCTCGCC 7680 CCCCCCTTCCCGCGGCAGCG TTCCCACGGC TGGCGAAATC GCGGGAGTCC TCCTTCCCCT 7740 CCTCGGGGTCGAGAGGGTCC GTGTCTGGCG TTGATTGATC TCGCTCTCGG GGACGGGACC 7800 GTTCTGTGGGAGAACGGCTG TTGGCCGCGT CCGGCGCGAC GTCGGACGTG GGGACCCACT 7860 GCCGCTCGGGGGTCTTCGTC GGTAGGCATC GGTGTGTCGG CATCGGTCTC TCTCTCGTGT 7920 CGGTGTCGCCTCCTCGGGCT CCCGGGGGGC CGTCGTGTTT CGGGTCGGCT CGGCGCTGCA 7980 GGTGTGGTGGGACTGCTCAG GGGAGTGGTG CAGTGTGATT CCCGCCGGTT TTGCCTCGCG 8040 TGCCCTGACCGGTCCGACGC CCGAGCGGTC TCTCGGTCCC TTGTGAGGAC CCCCTTCCGG 8100 GAGGGGCCCGTTTCGGCCGC CCTTGCCGTC GTCGCCGGCC CTCGTTCTGC TGTGTCGTTC 8160 CCCCCTCCCCGCTCGCCGCA GCCGGTCTTT TTTCCTCTCT CCCCCCCTCT CCTCTGACTG 8220 ACCCGTGGCCGTGCTGTCGG ACCCCCCGCA TGGGGGCGGC CGGGCACGTA CGCGTCCGGG 8280 CGGTCACCGGGGTCTTGGGG GGGGGCCGAG GGGTAAGAAA GTCGGCTCGG CGGGCGGGAG 8340 GAGCTGTGGTTTGGAGGGCG TCCCGGCCCC GCGGCCGTGG CGGTGTCTTG CGCGGTCTTG 8400 GAGAGGGCTGCGTGCGAGGG GAAAAGGTTG CCCCGCGAGG GCAAAGGGAA AGAGGCTAGC 8460 AGTGGTCATTGTCCCGACGG TGTGGTGGTC TGTTGGCCGA GGTGCGTCTG GGGGGCTCGT 8520 CCGGCCCTGTCGTCCGTCGG GAAGGCGCGT GTTGGGGCCT GCCGGAGTGC CGAGGTGGGT 8580 ACCCTGGCGGTGGGATTAAC CCCGCGCGCG TGTCCCGGTG TGGCGGTGGG GGCTCCGGTC 8640 GATGTCTACCTCCCTCTCCC CGAGGTCTCA GGCCTTCTCC GCGCGGGCTC TCGGCCCTCC 8700 CCTCGTTCCTCCCTCTCGCG GGGTTCAAGT CGCTCGTCGA CCTCCCCTCC TCCGTCCTTC 8760 CATCTCTCGCGCAATGGCGC CGCCCGAGTT CACGGTGGGT TCGTCCTCCG CCTCCGCTTC 8820 TCGCCGGGGGCTGGCCGCTG TCCGGTCTCT CCTGCCCGAC CCCCGTTGGC GTGGTCTTCT 8880 CTCGCCGGCTTCGCGGACTC CTGGCTTCGC CCGGAGGGTC AGGGGGCTTC CCGGTTCCCC 8940 GACGTTGCGCCTCGCTGCTG TGTGCTTGGG GGGGGCCCGC TGCGGCCTCC GCCCGCCCGT 9000 GAGCCCCTGCCGCACCCGCC GGTGTGCGGT TTCGCGCCGC GGTCAGTTGG GCCCTGGCGT 9060 TGTGTCGCGTCGGGAGCGTG TCCGCCTCGC GGCGGCTAGA CGCGGGTGTC GCCGGGCTCC 9120 GACGGGTGGCCTATCCAGGG CTCGCCCCCG CCGACCCCCG CCTGCCCGTC CCGGTGGTGG 9180 TCGTTGGTGTGGGGAGTGAA TGGTGCTACC GGTCATTCCC TCCCGCGTGG TTTGACTGTC 9240 TCGCCGGTGTCGCGCTTCTC TTTCCGCCAA CCCCCACGCC AACCCACCAC CCTGCTCTCC 9300 CGGCCCGGTGCGGTCGACGT TCCGGCTCTC 
CCGATGCCGA GGGGTTCGGG ATTTGTGCCG 9360 GGGACGGAGGGGAGAGCGGG TAAGAGAGGT GTCGGAGAGC TGTCCCGGGG CGACGCTCGG 9420 GTTGGCTTTGCCGCGTGCGT GTGCTCGCGG ACGGGTTTTG TCGGACCCCG ACGGGGTCGG 9480 TCCGGCCGCATGCACTCTCC CGTTCCGCGC GAGCGCCCGC CCGGCTCACC CCCGGTTTGT 9540 CCTCCCGCGAGGCTCTCCGC CGCCGCCGCC TCCTCCTCCT CTCTCGCGCT CTCTGTCCCG 9600 CCTGGTCCTGTCCCACCCCC GACGCTCCGC TCGCGCTTCC TTACCTGGTT GATCCTGCCA 9660 GGTAGCATATGCTTGTCTCA AAGATTAAGC CATGCATGTC TAAGTACGCA CGGCCGGTAC 9720 AGTGAAACTGCGAATGGCTC ATTAAATCAG TTATGGTTCC TTTGGTCGCT CGCTCCTCTC 9780 CTACTTGGATAACTGTGGTA ATTCTAGAGC TAATACATGC CGACGGGCGC TGACCCCCCT 9840 TCCCGGGGGGGGATGCGTGC ATTTATCAGA TCAAAACCAA CCCGGTGAGC TCCCTCCCGG 9900 CTCCGGCCGGGGGTCGGGCG CCGGCGGCTT GGTGACTCTA GATAACCTCG GGCCGATCGC 9960 ACGCCCCCCGTGGCGGCGAC GACCCATTCG AACGTCTGCC CTATCAACTT TCGATGGTAG 10020 TCGCCGTGCCTACCATGGTG ACCACGGGTG ACGGGGAATC AGGGTTCGAT TCCGGAGAGG 10080 GAGCCTGAGAAACGGCTACC ACATCCAAGG AAGGCAGCAG GCGCGCAAAT TACCCACTCC 10140 CGACCCGGGGAGGTAGTGAC GAAAAATAAC AATACAGGAC TCTTTCGAGG CCCTGTAATT 10200 GGAATGAGTCCACTTTAAAT CCTTTAACGA GGATCCATTG GAGGGCAAGT CTGGTGCCAG 10260 CAGCCGCGGTAATTCCAGCT CCAATAGCGT ATATTAAAGT TGCTGCAGTT AAAAAGCTCG 10320 TAGTTGGATCTTGGGAGCGG GCGGGCGGTC CGCCGCGAGG CGAGTCACCG CCCGTCCCCG 10380 CCCCTTGCCTCTCGGCGCCC CCTCGATGCT CTTAGCTGAG TGTCCCGCGG GGCCCGAAGC 10440 GTTTACTTTGAAAAAATTAG AGTGTTCAAA GCAGGCCCGA GCCGCCTGGA TACCGCAGCT 10500 AGGAATAATGGAATAGGACC GCGGTTCTAT TTTGTTGGTT TTCGGAACTG AGGCCATGAT 10560 TAAGAGGGACGGCCGGGGGC ATTCGTATTG CGCCGCTAGA GGTGAAATTC TTGGACCGGC 10620 GCAAGACGGACCAGAGCGAA AGCATTTGCC AAGAATGTTT TCATTAATCA AGAACGAAAG 10680 TCGGAGGTTCGAAGACGATC AGATACCGTC GTAGTTCCGA CCATAAACGA TGCCGACTGG 10740 CGATGCGGCGGCGTTATTCC CATGACCCGC CGGGCAGCTT CCGGGAAACC AAAGTCTTTG 10800 GGTTCCGGGGGGAGTATGGT TGCAAAGCTG AAACTTAAAG GAATTGACGG AAGGGCACCA 10860 CCAGGAGTGGGCCTGCGGCT TAATTTGACT CAACACGGGA AACCTCACCC GGCCCGGACA 10920 CGGACAGGATTGACAGATTG ATAGCTCTTT CTCGATTCCG TGGGTGGTGG TGCATGGCCG 10980 TTCTTAGTTGGTGGAGCGAT TTGTCTGGTT AATTCCGATA ACGAACGAGA 
CTCTGGCATG 11040 CTAACTAGTTACGCGACCCC CGAGCGGTCG GCGTCCCCCA ACTTCTTAGA GGGACAAGTG 11100 GCGTTCAGCCACCCGAGATT GAGCAATAAC AGGTCTGTGA TGCCCTTAGA TGTCCGGGGC 11160 TGCACGCGCGCTACACTGAC TGGCTCAGCG TGTGCCTACC CTGCGCCGGC AGGCGCGGGT 11220 AACCCGTTGAACCCCATTCG TGATGGGGAT CGGGGATTGC AATTATTCCC CATGAACGAG 11280 GAATTCCCAGTAAGTGCGGG TCATAAGCTT GCGTTGATTA AGTCCCTGCC CTTTGTACAC 11340 ACCGCCCGTCGCTACTACCG ATTGGATGGT TTAGTGAGGC CCTCGGATCG GCCCCGCCGG 11400 GGTCGGCCCACGGCCCTGGC GGAGCGCTGA GAAGACGGTC GAACTTGACT ATCTAGAGGA 11460 AGTAAAAGTCGTAACAAGGT TTCCGTAGGT GAACCTGCGG AAGGATCATT AAACGGGAGA 11520 CTGTGGAGGAGCGGCGGCGT GGCCCGCTCT CCCCGTCTTG TGTGTGTCCT CGCCGGGAGG 11580 CGCGTGCGTCCCGGGTCCCG TCGCCCGCGT GTGGAGCGAG GTGTCTGGAG TGAGGTGAGA 11640 GAAGGGGTGGGTGGGGTCGG TCTGGGTCCG TCTGGGACCG CCTCCGATTT CCCCTCCCCC 11700 TCCCCTCTCCCTCGTCCGGC TCTGACCTCG CCACCCTACC GCGGCGGCGG CTGCTCGCGG 11760 GCGTCTTGCCTCTTTCCCGT CCGGCTCTTC CGTGTCTACG AGGGGCGGTA CGTCGTTACG 11820 GGTTTTTGACCCGTCCCGGG GGCGTTCGGT CGTCGGGGCG CGCGCTTTGC TCTCCCGGCA 11880 CCCATCCCCGCCGCGGCTCT GGCTTTTCTA CGTTGGCTGG GGCGGTTGTC GCGTGTGGGG 11940 GGATGTGAGTGTCGCGTGTG GGCTCGCCCG TCCCGATGCC ACGCTTTTCT GGCCTCGCGT 12000 GTCCTCCCCGCTCCTGTCCC GGGTACCTAG CTGTCGCGTT CCGGCGCGGA GGTTTAAGGA 12060 CCCCGGGGGGGTCGCCCTGC CGCCCCCAGG GTCGGGGGGC GGTGGGGCCC GTAGGGAAGT 12120 CGGTCGTTCGGGCGGCTCTC CCTCAGACTC CATGACCCTC CTCCCCCCGC TGCCGCCGTT 12180 CCCGAGGCGGCGGTCGTGTG GGGGGGTGGA TGTCTGGAGC CCCCTCGGGC GCCGTGGGGG 12240 CCCGACCCGCGCCGCCGGCT TGCCCGATTT CCGCGGGTCG GTCCTGTCGG TGCCGGTCGT 12300 GGGTTCCCGTGTCGTTCCCG TGTTTTTCCG CTCCCGACCC TTTTTTTTTC CTCCCCCCCA 12360 CACGTGTCTCGTTTCGTTCC TGCTGGCCGG CCTGAGGCTA CCCCTCGGTC CATCTGTTCT 12420 CCTCTCTCTCCGGGGAGAGG AGGGCGGTGG TCGTTGGGGG ACTGTGCCGT CGTCAGCACC 12480 CGTGAGTTCGCTCACACCCG AAATACCGAT ACGACTCTTA GCGGTGGATC ACTCGGCTCG 12540 TGCGTCGATGAAGAACGCAG CTAGCTGCGA GAATTAATGT GAATTGCAGG ACACATTGAT 12600 CATCGACACTTCGAACGCAC TTGCGGCCCC GGGTTCCTCC CGGGGCTACG CCTGTCTGAG 12660 CGTCGGTTGACGATCAATCG CGTCACCCGC TGCGGTGGGT GCTGCGCGGC TGGGAGTTTG 
12720 CTCGCAGGGCCAACCCCCCA ACCCGGGTCG GGCCCTCCGT CTCCCGAAGT TCAGACGTGT 12780 GGGCGGTTGTCGGTGTGGCG CGCGCGCCCG CGTCGCGGAG CCTGGTCTCC CCCGCGCATC 12840 CGCGCTCGCGGCTTCTTCCC GCTCCGCCGT TCCCGCCCTC GCCCGTGCAC CCCGGTCCTG 12900 GCCTCGCGTCGGCGCCTCCC GGACCGCTGC CTCACCAGTC TTTCTCGGTC CCGTGCCCCG 12960 TGGGAACCCACCGCGCCCCC GTGGCGCCCG GGGGTGGGCG CGTCCGCATC TGCTCTGGTC 13020 GAGGTTGGCGGTTGAGGGTG TGCGTGCGCC GAGGTGGTGG TCGGTCCCCT GCGGCCGCGG 13080 GGTTGTCGGGGTGGCGGTCG ACGAGGGCCG GTCGGTCGCC TGCGGTGGTT GTCTGTGTGT 13140 GTTTGGGTCTTGCGCTGGGG GAGGCGGGGT CGACCGCTCG CGGGGTTGGC GCGGTCGCCC 13200 GGCGCCGCGCACCCTCCGGC TTGTGTGGAG GGAGAGCGAG GGCGAGAACG GAGAGAGGTG 13260 GTATCCCCGGTGGCGTTGCG AGGGAGGGTT TGGCGTCCCG CGTCCGTCCG TCCCTCCCTC 13320 CCTCGGTGGGCGCCTTCGCG CCGCACGCGG CCGCTAGGGG CGGTCGGGGC CCGTGGCCCC 13380 CGTGGCTCTTCTTCGTCTCC GCTTCTCCTT CACCCGGGCG GTACCCGCTC CGGCGCCGGC 13440 CCGCGGGACGCCGCGGCGTC CGTGCGCCGA TGCGAGTCAC CCCCGGGTGT TGCGAGTTCG 13500 GGGAGGGAGAGGGCCTCGCT GACCCGTTGC GTCCCGGCTT CCCTGGGGGG GACCCGGCGT 13560 CTGTGGGCTGTGCGTCCCGG GGGTTGCGTG TGAGTAAGAT CCTCCACCCC CGCCGCCCTC 13620 CCCTCCCGCCGGCCTCTCGG GGACCCCCTG AGACGGTTCG CCGGCTCGTC CTCCCGTGCC 13680 GCCGGGTGCCGTCTCTTTCC CGCCCGCCTC CTCGCTCTCT TCTTCCCGCG GCTGGGCGCG 13740 TGTCCCCCCTTTCTGACCGC GACCTCAGAT CAGACGTGGC GACCCGCTGA ATTTAAGCAT 13800 ATTAGTCAGCGGAGGAAAAG AAACTAACCA GGATTCCCTC AGTAACGGCG AGTGAACAGG 13860 GAAGAGCCCAGCGCCGAATC CCCGCCGCGC GTCGCGGCGT GGGAAATGTG GCGTACGGAA 13920 GACCCACTCCCCGGCGCCGC TCGTGGGGGG CCCAAGTCCT TCTGATCGAG GCCCAGCCCG 13980 TGGACGGTGTGAGGCCGGTA GCGGCCCCGG CGCGCCGGGC TCGGGTCTTC CCGGAGTCGG 14040 GTTGCTTGGGAATGCAGCCC AAAGCGGGTG GTAAACTCCA TCTAAGGCTA AATACCGGCA 14100 CGAGACCGATAGTCAACAAG TACCGTAAGG GAAAGTTGAA AAGAACTTTG AAGAGAGAGT 14160 TCAAGAGGGCGTGAAACCGT TAAGAGGTAA ACGGGTGGGG TCCGCGCAGT CCGCCCGGAG 14220 GATTCAACCCGGCGGCGCGC GTCCGGCCGT GCCCGGTGGT CCCGGCGGAT CTTTCCCGCT 14280 CCCCGTTCCTCCCGACCCCT CCACCCGCGC GTCGTTCCCC TCTTCCTCCC CGCGTCCGGC 14340 GCCTCCGGCGGCGGGCGCGG GGGGTGGTGT GGTGGTGGCG CGCGGGCGGG GCCGGGGGTG 14400 
GGGTCGGCGGGGGACCGCCC CCGGCCGGCG ACCGGCCGCC GCCGGGCGCA CTTCCACCGT 14460 GGCGGTGCGCCGCGACCGGC TCCGGGACGG CCGGGAAGGC CCGGTGGGGA AGGTGGCTCG 14520 GGGGGGGCGGCGCGTCTCAG GGCGCGCCGA ACCACCTCAC CCCGAGTGTT ACAGCCCTCC 14580 GGCCGCGCTTTCGCCGAATC CCGGGGCCGA GGAAGCCAGA TACCCGTCGC CGCGCTCTCC 14640 CTCTCCCCCCGTCCGCCTCC CGGGCGGGCG TGGGGGTGGG GGCCGGGCCG CCCCTCCCAC 14700 GGCGCGACCGCTCTCCCACC CCCCTCCGTC GCCTCTCTCG GGGCCCGGTG GGGGGCGGGG 14760 CGGACTGTCCCCAGTGCGCC CCGGGCGTCG TCGCGCCGTC GGGTCCCGGG GGGACCGTCG 14820 GTCACGCGTCTCCCGACGAA GCCGAGCGCA CGGGGTCGGC GGCGATGTCG GCTACCCACC 14880 CGACCCGTCTTGAAACACGG ACCAAGGAGT CTAACGCGTG CGCGAGTCAG GGGCTCGTCC 14940 GAAAGCCGCCGTGGCGCAAT GAAGGTGAAG GGCCCCGCCC GGGGGCCCGA GGTGGGATCC 15000 CGAGGCCTCTCCAGTCCGCC GAGGGCGCAC CACCGGCCCG TCTCGCCCGC CGCGCCGGGG 15060 AGGTGGAGCACGAGCGTACG CGTTAGGACC CGAAAGATGG TGAACTATGC TTGGGCAGGG 15120 CGAAGCCAGAGGAAACTCTG GTGGAGGTCC GTAGCGGTCC TGACGTGCAA ATCGGTCGTC 15180 CGACCTGGGTATAGGGGCGA AAGACTAATC GAACCATCTA GTAGCTGGTT CCCTCCGAAG 15240 TTTCCCTCAGGATAGCTGGC GCTCTCGCTC CCGACGTACG CAGTTTTATC CGGTAAAGCG 15300 AATGATTAGAGGTCTTGGGG CCGAAACGAT CTCAACCTAT TCTCAAACTT TAAATGGGTA 15360 AGAAGCCCGGCTCGCTGGCG TGGAGCCGGG CGTGGAATGC GAGTGCCTAG TGGGCCACTT 15420 TTGGTAAGCAGAACTGGCGC TGCGGGATGA ACCGAACGCC GGGTTAAGGC GCCCGATGCC 15480 GACGCTCATCAGACCCCAGA AAAGGTGTTG GTTGATATAG ACAGCAGGAC GGTGGCCATG 15540 GAAGTCGGAATCCGCTAAGG AGTGTGTAAC AACTCACCTG CCGAATCAAC TAGCCCTGAA 15600 AATGGATGGCGCTGGAGCGT CGGGCCCATA CCCGGCCGTC GCCGCAGTCG GAACGGAACG 15660 GGACGGGAGCGGCCGCGGGT GCGCGTCTCT CGGGGTCGGG GGTGCGTGGC GGGGGCCCGT 15720 CCCCCGCCTCCCCTCCGCGC GCCGGGTTCG CCCCCGCGGC GTCGGGCCCC GCGGAGCCTA 15780 CGCCGCGACGAGTAGGAGGG CCGCTGCGGT GAGCCTTGAA GCCTAGGGCG CGGGCCCGGG 15840 TGGAGCCGCCGCAGGTGCAG ATCTTGGTGG TAGTAGCAAA TATTCAAACG AGAACTTTGA 15900 AGGCCGAAGTGGAGAAGGGT TCCATGTGAA CAGCAGTTGA ACATGGGTCA GTCGGTCCTG 15960 AGAGATGGGCGAGTGCCGTT CCGAAGGGAC GGGCGATGGC CTCCGTTGCC CTCGGCCGAT 16020 CGAAAGGGAGTCGGGTTCAG ATCCCCGAAT CCGGAGTGGC GGAGATGGGC GCCGCGAGGC 16080 
CAGTGCGGTAACGCGACCGA TCCCGGAGAA GCCGGCGGGA GGCCTCGGGG AGAGTTCTCT 16140 TTTCTTTGTGAAGGGCAGGG CGCCCTGGAA TGGGTTCGCC CCGAGAGAGG GGCCCGTGCC 16200 TTGGAAAGCGTCGCGGTTCC GGCGGCGTCC GGTGAGCTCT CGCTGGCCCT TGAAAATCCG 16260 GGGGAGAGGGTGTAAATCTC GCGCCGGGCC GTACCCATAT CCGCAGCAGG TCTCCAAGGT 16320 GAACAGCCTCTGGCATGTTG GAACAATGTA GGTAAGGGAA GTCGGCAAGC CGGATCCGTA 16380 ACTTCGGGATAAGGATTGGC TCTAAGGGCT GGGTCGGTCG GGCTGGGGCG CGAAGCGGGG 16440 CTGGGCGCGCGCCGCGGCTG GACGAGGCGC CGCCGCCCTC TCCCACGTCC GGGGAGACCC 16500 CCCGTCCTTTCCGCCCGGGC CCGCCCTCCC CTCTTCCCCG CGGGGCCCCG TCGTCCCCCG 16560 CGTCGTCGCCACCTCTCTTC CCCCCTCCTT CTTCCCGTCG GGGGGCGGGT CGGGGGTCGG 16620 CGCGCGGCGCGGGCTCCGGG GCGGCGGGTC CAACCCCGCG GGGGTTCCGG AGCGGGAGGA 16680 ACCAGCGGTCCCCGGTGGGG CGGGGGGCCC GGACACTCGG GGGGCCGGCG GCGGCGGCGA 16740 CTCTGGACGCGAGCCGGGCC CTTCCCGTGG ATCGCCTCAG CTGCGGCGGG CGTCGCGGCC 16800 GCTCCCGGGGAGCCCGGCGG GTGCCGGCGC GGGTCCCCTC CCCGCGGGGC CTCGCTCCAC 16860 CCCCCCATCGCCTCTCCCGA GGTGCGTGGC GGGGGCGGGC GGGCGTGTCC CGCGCGTGTG 16920 GGGGGAACCTCCGCGTCGGT GTTCCCCCGC CGGGTCCGCC CCCCGGGCCG CGGTTTTCCG 16980 CGCGGCGCCCCCGCCTCGGC CGGCGCCTAG CAGCCGACTT AGAACTGGTG CGGACCAGGG 17040 GAATCCGACTGTTTAATTAA AACAAAGCAT CGCGAAGGCC CGCGGCGGGT GTTGACGCGA 17100 TGTGATTTCTGCCCAGTGCT CTGAATGTCA AAGTGAAGAA ATTCAATGAA GCGCGGGTAA 17160 ACGGCGGGAGTAACTATGAC TCTCTTAAGG TAGCCAAATG CCTCGTCATC TAATTAGTGA 17220 CGCGCATGAATGGATGAACG AGATTCCCAC TGTCCCTACC TACTATCCAG CGAAACCACA 17280 GCCAAGGGAACGGGCTTGGC GGAATCAGCG GGGAAAGAAG ACCCTGTTGA GCTTGACTCT 17340 AGTCTGGCACGGTGAAGAGA CATGAGAGGT GTAGAATAAG TGGGAGGCCC CCGGCGCCCG 17400 GCCCCGTCCTCGCGTCGGGG TCGGGGCACG CCGGCCTCGC GGGCCGCCGG TGAAATACCA 17460 CTACTCTCATCGTTTTTTCA CTGACCCGGT GAGGCGGGGG GGCGAGCCCC GAGGGGCTCT 17520 CGCTTCTGGCGCCAAGCGTC CGTCCCGCGC GTGCGGGCGG GCGCGACCCG CTCCGGGGAC 17580 AGTGCCAGGTGGGGAGTTTG ACTGGGGCGG TACACCTGTC AAACGGTAAC GCAGGTGTCC 17640 TAAGGCGAGCTCAGGGAGGA CAGAAACCTC CCGTGGAGCA GAAGGGCAAA AGCTCGCTTG 17700 ATCTTGATTTTCAGTACGAA TACAGACCGT GAAAGCGGGG CCTCACGATC CTTCTGACCT 17760 
TTTGGGTTTTAAGCAGGAGG TGTCAGAAAA GTTACCACAG GGATAACTGG CTTGTGGCGG 17820 CCAAGCGTTCATAGCGACGT CGCTTTTTGA TCCTTCGATG TCGGCTCTTC CTATCATTGT 17880 GAAGCAGAATTCACCAAGCG TTGGATTGTT CACCCACTAA TAGGGAACGT GAGCTGGGTT 17940 TAGACCGTCGTGAGACAGGT TAGTTTTACC CTACTGATGA TGTGTTGTTG CCATGGTAAT 18000 CCTGCTCAGTACGAGAGGAA CCGCAGGTTC AGACATTTGG TGTATGTGCT TGGCTGAGGA 18060 GCCAATGGGGCGAAGCTACC ATCTGTGGGA TTATGACTGA ACGCCTCTAA GTCAGAATCC 18120 GCCCAAGCGGAACGATACGG CAGCGCCGAA GGAGCCTCGG TTGGCCCCGG ATAGCCGGGT 18180 CCCCGTCCGTCCCGCTCGGC GGGGTCCCCG CGTCGCCCCG CGGCGGCGCG GGGTCTCCCC 18240 CCGCCGGGCGTCGGGACCGG GGTCCGGTGC GGAGAGCCGT TCGTCTTGGG AAACGGGGTG 18300 CGGCCGGAAAGGGGGCCGCC CTCTCGCCCG TCACGTTGAA CGCACGTTCG TGTGGAACCT 18360 GGCGCTAAACCATTCGTAGA CGACCTGCTT CTGGGTCGGG GTTTCGTACG TAGCAGAGCA 18420 GCTCCCTCGCTGCGATCTAT TGAAAGTCAG CCCTCGACAC AAGGGTTTGT CTCTGCAGCA 18480 TTTCCCGTCGCACGCCCGCT CGCTCGCACG CGACCGTGTC GCCGCCCGGG CGTCACGGGC 18540 GCGGTCGCCTCGGCCCCCGC GCGGTTGCCC GAACGACCGT GTGGTGGTTG GGGGGGGGGG 18600 CGTCTTCTCCTCCGTCTCCC GAGGACGGTT CGTTTCTCTT TCCCCTTCCG TCGCTCGGAT 18660 TGGGTGTGGGAGCCTCGTGC CGTCGCGACC GCGGCCTGCC GTCGCCTGCC GCCGCATCCT 18720 CTTGCCCTCCGGCCTTGGCC AAGCCGGAGG GCGGAGGAGG GGGATCGGCG GCGGCGGCCC 18780 CCGCGGCGCGGTGACGCACG GTGGGATCCC CATCCTCGGC GCGTCCGTCG GGGACGGCGA 18840 GTTGGAGGGGCGGGAGGGGT TTTTCCCGTG AACGCCGCGT TCGGCGCCAG GCCTCTGCCG 18900 GCCGGGGGGGCGCTCTCTCC GCCCGAGCAT CCCCACTCCC GCCCCTCCTC TTCGCGGGCG 18960 GCGGCGGCGACGTGCGTACG AGGGGAGGAT GTCGCGGTGT GGAGGCGGAG AGGGTCCGCC 19020 GCGGCGCCTCTTCCATTTTT TCCCCCCCAA CTTCGGAGGT CGACCAGTAC TCCGGGCGGC 19080 ACTTTGTTTTTTTTTTTTCC CCCGATGCTG GAGGTCGACC AGATGTCCGA AAGTGTCGAC 19140 CCCCCCCCCCCCCCCCGGCG CGGAGCGGCG GGGCCACTCT GGACTCTTTT TTTTTTCCCC 19200 TTTTTTTTTTTTAAATTCCT GGAACCTTTA GGTCGACCAG TTGTCCGTCT TTTACTTTTT 19260 CATATAGGTCGACCAGTACT CCGGGTGGTA CTTTGTCTTT TTCTGAAAAT CCCAGACCTT 19320 GACCAGATATCCGAAAGTCC TCTCTTTCCC TTTACTCTTC CCCACAGCGA TTCTCTGGTC 19380 TTTTTTTTTTTTTGGTGTGC CTCTTTTTGA CTTATATACA TGTAAATAGT GTGTACTTTT 19440 
ATATACTTATAGGAGGAGGT CGACCAGTAC TCCGGGCGAC ACTTTGTTTT TTTTTTGTTT 19500 TCCACCGATGATGGAGGTCG ACCAGATGTC CGAAAGTGTC CCGTCCCCCC CCTCCCTTTT 19560 CCGCGACGCGGCGGGCTCAC TCTGGACTCT TTTTTTTTTT TTTTTTTTTT TTTAAACCCC 19620 TGGAACCTTAAGGTCGACCA GTTGTCCGTC TTTCACTCAT TCATATAGGT CGACCGTTTC 19680 TACTTTGTCTTTTTCTGAAA ATCGCAGAGG TCGACCAGAT GTCAGAAAGT CTGGTGGTGG 19740 ATAAATTATCTGATCTAGAT TTGTTTTTCT GTTTTTCAGT TTTGTGTTGT TTTGTGGTCG 19800 TTTGTGTTGTTTTGTTTTGT TTTGTTTTGT TTTGTTTTGT TTTGTTTTGT TTTGTTTTGT 19860 TTTGTGTTGTGTTGTGTTGT GTTGTGTTGG GTTGGGTTGG GTTGGGTTGG GTTGGGTTGT 19920 GTTGGGTTGGGTTGGGTTGT GTTGTTTGGT TTTGTGTTGT TTGGTGTTGT TGGTTTTTGG 19980 TTGTTTGCTGTTGTTTTGTG TTTTGCGGGT CGAACAGTTG TCCCTAACCG AGTTTTTGTT 20040 TACACAAACATGCACTTTTT TTAAAATAAA TTTTTAAAAT AAATGCGAAA ATCGACTTTG 20100 TATCCCTTTCCTTCTCTCTC TTTTTTAAAA ATTTTCTTTG TGTGTGTGTG TGTGTGCAAT 20160 TGTGTGTGTGTGCGTGTGTG TGTGTGTGTG CGTGCAGCGT GCGCGCGCTC GTTTTATGTG 20220 TACTTATAATAATAGGTCGC CGGGTGGTGG TAGCTTCCCG GACTCCAGAG GCAGAGTAAA 20280 GCAGACTTCTGAGTTCGAGG CCAGCCTGGT CTACAGAGGA ACCCTGTCTC GAAAAAGCAG 20340 AATAAATACATACATACATA CATACATACA TACATACATA CATACATACA TACATATGAG 20400 GTTGACCAGTTGTCAATCCT TTAGAATTTT GTTTTTAATT AATGTGATAG AGAGATAGAT 20460 AATAGATAGATGGATAGAGT GATACAAATA TAGGTTTTTT TTTCAGTAAA TATGAGGTTG 20520 ATTAACCACTTTTCCCTTTT TAGGTTTTTT TTTTTTTCCC CTGTCCATGT GGTTGCTGGG 20580 ATTTGAACTCAGGACCCTGG CAGGTCAACT GGAAAACGTG TTTTCTATAT ATATAAATAG 20640 TGGTCTGTCTGCTGTTTGTT TGTTTGCTTG CTTGCTTGCT TGCTTGCTTG CTTGCTTGCT 20700 TGCTTTTTTTTTTCTTCTGA GACAGTATTT CTCTGTGTAA CCTGGTGCCC TGAAACTCAC 20760 TCTGTAGACCAGCCTGGCCT CAATCGAACT CAGAAATCCT CCTGCCTCTT GTCTACCTCC 20820 CAATTTTGGAGTAAAGGTGT GCTACACCAC TGCCTGGCAT TATTATCATT ATCATTATTA 20880 ATTTTATTATTAGACAGAAC GAAATCAACT AGTTGGTCCT GTTTCGTTAA TTCATTTGAA 20940 ATTAGTTGGACCAATTAGTT GGCTGGTTTG GGAGGTTTCT TTTGTTTCCG ATTTGGGTGT 21000 TTGTGGGGCTGGGGATCAGG TATCTCAACG GAATGCATGA AGGTTAAGGT GAGATGGCTC 21060 GATTTTTGTAAAGATTACTT TTCTTAGTCT GAGGAAAAAA TAAAATAATA TTGGGCTACG 21120 
TTTCATTGCTTCATTTCTAT TTCTCTTTCT TTCTTTCTTT CTTTCAGATA AGGAGGTCGG 21180 CCAGTTCCTCCTGCCTTCTG GAAGATGTAG GCATTGCATT GGGAAAAGCA TTGTTTGAGA 21240 GATGTGCTAGTGAACCAGAG AGTTTGGATG TCAAGCCGTA TAATGTTTAT TACAATATAG 21300 AAAAGTTCTAACAAAGTGAT CTTTAACTTT TTTTTTTTTT TTTCTCCTTC TACTTCTACT 21360 TGTTCTCACTCTGCCACCAA CGCGCTTTGT ACATTGAATG TGAGCTTTGT TTTGCTTAAC 21420 AGACATATATTTTTTCTTTT GGTTTTGCTT GACATGGTTT CCCTTTCTAT CCGTGCAGGG 21480 TTCCCAGACGGCCTTTTGAG AATAAAATGG GAGGCCAGAA CCAAAGTCTT TTGAATAAAG 21540 CACCACAACTCTAACCTGTT TGGCTGTTTT CCTTCCCAAG GCACAGATCT TTCCCAGCAT 21600 GGAAAAGCATGTAGCAGTTG TAGGACACAC TAGACGAGAG CACCAGATCT CATTGTGGGT 21660 GGTTGTGAACCACCCACCAT GTGGTTGCCT GGGATTTGAA CTCAGGATCT TCAGAAGACG 21720 AGTCAGGGCTCTAAACCGAT GAGCCATCTC TCCAGCCCTC CTACATTCCT TCTTAAGGCA 21780 TGAATGATCCCAGCATGGGA AGACAGTCTG CCCTCTTTGT GGTATATCAC CATATACTCA 21840 ATAAAATAATGAAATGAATG AAGTCTCCAC GTATTTATTT CTTCGAGCTA TCTAAATTCT 21900 CTCACAGCACCTCCCCCTCC CCCACACTGC CTTTCTCCCT ATGTTTGGGT GGGGCTGGGG 21960 GAGGGGTGGGGTGGGGGCAG GGATCTGCAT GTCTTCTTGC AGGTCTGTGA ACTATTTGCG 22020 ATGGCCTGGTTCTCTGAACT GTTGAGCCTT GTCTATCCAG AGGCTGACTG GCTAGTTTTC 22080 TACCTGAAGTCCCTGAGTGA TGATTTCCCT GTGAATTC 22118
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42999 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: <Unknown>
(vi) ORIGINAL SOURCE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GCTGACACGC TGTCCTCTGG CGACCTGTCG TCGGAGAGGTTGGGCCTCCG GATGCGCGCG 60 GGGCTCTGGC CTCACGGTGA CCGGCTAGCC GGCCGCGCTCCTGCCTTGAG CCGCCTGCCG 120 CGGCCCGCGG GCCTGCTGTT CTCTCGCGCG TCCGAGCGTCCCGACTCCCG GTGCCGGCCC 180 GGGTCCGGGT CTCTGACCCA CCCGGGGGCG GCGGGGAAGGCGGCGAGGGC CACCGTGCCC 240 CGTGCGCTCT CCGCTGCGGG CGCCCGGGGC GCCGCACAACCCCACCCGCT GGCTCCGTGC 300 CGTGCGTGTC AGGCGTTCTC GTCTCCGCGG GGTTGTCCGCCGCCCCTTCC CCGGAGTGGG 360 GGGTGGCCGG AGCCGATCGG CTCGCTGGCC GGCCGGCCTCCGCTCCCGGG GGGCTCTTCG 420 
ATCGATGTGG TGACGTCGTG CTCTCCCGGG CCGGGTCCGAGCCGCGACGG GCGAGGGGCG 480 GACGTTCGTG GCGAACGGGA CCGTCCTTCT CGCTCCGCCCGCGCGGTCCC CTCGTCTGCT 540 CCTCTCCCCG CCCGCCGGCC GGCGTGTGGG AAGGCGTGGGGTGCGGACCC CGGCCCGACC 600 TCGCCGTCCC GCCCGCCGCC TTCGCTTCGC GGGTGCGGGCCGGCGGGGTC CTCTGACGCG 660 GCAGACAGCC CTGCCTGTCG CCTCCAGTGG TTGTCGACTTGCGGGCGGCC CCCCTCCGCG 720 GCGGTGGGGG TGCCGTCCCG CCGGCCCGTC GTGCTGCCCTCTCGGGGGGG GTTTGCGCGA 780 GCGTCGGCTC CGCCTGGGCC CTTGCGGTGC TCCTGGAGCGCTCCGGGTTG TCCCTCAGGT 840 GCCCGAGGCC GAACGGTGGT GTGTCGTTCC CGCCCCCGGCGCCCCCTCCT CCGGTCGCCG 900 CCGCGGTGTC CGCGCGTGGG TCCTGAGGGA GCTCGTCGGTGTGGGGTTCG AGGCGGTTTG 960 AGTGAGACGA GACGAGACGC GCCCCTCCCA CGCGGGGAAGGGCGCCCGCC TGCTCTCGGT 1020 GAGCGCACGT CCCGTGCTCC CCTCTGGCGG GTGCGCGCGGGCCGTGTGAG CGATCGCGGT 1080 GGGTTCGGGC CGGTGTGACG CGTGCGCCGG CCGGCCGCCGAGGGGCTGCC GTTCTGCCTC 1140 CGACCGGTCG TGTGTGGGTT GACTTCGGAG GCGCTCTGCCTCGGAAGGAA GGAGGTGGGT 1200 GGACGGGGGG GCCTGGTGGG GTTGCGCGCA CGCGCGCACCGGCCGGGCCC CCGCCCTGAA 1260 CGCGAACGCT CGAGGTGGCC GCGCGCAGGT GTTTCCTCGTACCGCAGGGC CCCCTCCCTT 1320 CCCCAGGCGT CCCTCGGCGC CTCTGCGGGC CCGAGGAGGAGCGGCTGGCG GGTGGGGGGA 1380 GTGTGACCCA CCCTCGGTGA GAAAAGCCTT CTCTAGCGATCTGAGAGGCG TGCCTTGGGG 1440 GTACCGGATC CCCCGGGCCG CCGCCTCTGT CTCTGCCTCCGTTATGGTAG CGCTGCCGTA 1500 GCGACCCGCT CGCAGAGGAC CCTCCTCCGC TTCCCCCTCGACGGGGTTGG GGGGGAGAAG 1560 CGAGGGTTCC GCCGGCCACC GCGGTGGTGG CCGAGTGCGGCTCGTCGCCT ACTGTGGCCC 1620 GCGCCTCCCC CTTCCGAGTC GGGGGAGGAT CCCGCCGGGCCGGGCCCGGC GCTCCCACCC 1680 AGCGGGTTGG GACGCGGCGG CCGGCGGGCG GTGGGTGTGCGCGCCCGGCG CTCTGTCCGG 1740 CGCGTGACCC CCTCCGTCCG CGAGTCGGCT CTCCGCCCGCTCCCGTGCCG AGTCGTGACC 1800 GGTGCCGACG ACCGCGTTTG CGTGGCACGG GGTCGGGCCCGCCTGGCCCT GGGAAAGCGT 1860 CCCACGGTGG GGGCGCGCCG GTCTCCCGGA GCGGGACCGGGTCGGAGGAT GGACGAGAAT 1920 CACGAGCGAC GGTGGTGGTG GCGTGTCGGG TTCGTGGCTGCGGTCGCTCC GGGGCCCCCG 1980 GTGGCGGGGC CCCGGGGCTC GCGAGGCGGT TCTCGGTGGGGGCCGAGGGC CGTCCGGCGT 2040 CCCAGGCGGG GCGCCGCGGG ACCGCCCTCG TGTCTGTGGCGGTGGGATCC CGCGGCCGTG 2100 TTTTCCTGGT GGCCCGGCCG TGCCTGAGGT 
TTCTCCCCGAGCCGCCGCCT CTGCGGGCTC 2160 CCGGGTGCCC TTGCCCTCGC GGTCCCCGGC CCTCGCCCGTCTGTGCCCTC TTCCCCGCCC 2220 GCCGCCCGCC GATCCTCTTC TTCCCCCCGA GCGGCTCACCGGCTTCACGT CCGTTGGTGG 2280 CCCCGCCTGG GACCGAACCC GGCACCGCCT CGTGGGGCGCCGCCGCCGGC CACTGATCGG 2340 CCCGGCGTCC GCGTCCCCCG GCGCGCGCCT TGGGGACCGGGTCGGTGGCG CGCCGCGTGG 2400 GGCCCGGTGG GCTTCCCGGA GGGTTCCGGG GGTCGGCCTGCGGCGCGTGC GGGGGAGGAG 2460 ACGGTTCCGG GGGACCGGCC GCGGCTGCGG CGGCGGCGGTGGTGGGGGGA GCCGCGGGGA 2520 TCGCCGAGGG CCGGTCGGCC GCCCCGGGTG CCCCGCGGTGCCGCCGGCGG CGGTGAGGCC 2580 CCGCGCGTGT GTCCCGGCTG CGGTCGGCCG CGCTCGAGGGGTCCCCGTGG CGTCCCCTTC 2640 CCCGCCGGCC GCCTTTCTCG CGCCTTCCCC GTCGCCCCGGCCTCGCCCGT GGTCTCTCGT 2700 CTTCTCCCGG CCCGCTCTTC CGAACCGGGT CGGCGCGTCCCCCGGGTGCG CCTCGCTTCC 2760 CGGGCCTGCC GCGGCCCTTC CCCGAGGCGT CCGTCCCGGGCGTCGGCGTC GGGGAGAGCC 2820 CGTCCTCCCC GCGTGGCGTC GCCCCGTTCG GCGCGCGCGTGCGCCCGAGC GCGGCCCGGT 2880 GGTCCCTCCC GGACAGGCGT TCGTGCGACG TGTGGCGTGGGTCGACCTCC GCCTTGCCGG 2940 TCGCTCGCCC TCTCCCCGGG TCGGGGGGTG GGGCCCGGGCCGGGGCCTCG GCCCCGGTCG 3000 CTGCCTCCCG TCCCGGGCGG GGGCGGGCGC GCCGGCCGGCCTCGGTCGCC CTCCCTTGGC 3060 CGTCGTGTGG CGTGTGCCAC CCCTGCGCCG GCGCCCGCCGGCGGGGCTCG GAGCCGGGCT 3120 TCGGCCGGGC CCCGGGCCCT CGACCGGACC GGCTGCGCGGGCGCTGCGGC CGCACGGCGC 3180 GACTGTCCCC GGGCCGGGCA CCGCGGTCCG CCTCTCGCTCGCCGCCCGGA CGTCGGGGCC 3240 GCCCCGCGGG GCGGGCGGAG CGCCGTCCCC GCCTCGCCGCCGCCCGCGGG CGCCGGCCGC 3300 GCGCGCGCGC GCGTGGCCGC CGGTCCCTCC CGGCCGCCGGGCGCGGGTCG GGCCGTCCGC 3360 CTCCTCGCGG GCGGGCGCGA CGAAGAAGCG TCGCGGGTCTGTGGCGCGGG GCCCCCGGTG 3420 GTCGTGTCGC GTGGGGGGCG GGTGGTTGGG GCGTCCGGTTCGCCGCGCCC CGCCCCGGCC 3480 CCACCGGTCC CGGCCGCCGC CCCCGCGCCC GCTCGCTCCCTCCCGTCCGC CCGTCCGCGG 3540 CCCGTCCGTC CGTCCGTCCG TCGTCCTCCT CGCTTGCGGGGCGCCGGGCC CGTCCTCGCG 3600 AGGCCCCCCG GCCGGCCGTC CGGCCGCGTC GGGGGCTCGCCGCGCTCTAC CTTACCTACC 3660 TGGTTGATCC TGCCAGTAGC ATATGCTTGT CTCAAAGATTAAGCCATGCA TGTCTAAGTA 3720 CGCACGGCCG GTACAGTGAA ACTGCGAATG GCTCATTAAATCAGTTATGG TTCCTTTGGT 3780 CGCTCGCTCC TCTCCTACTT GGATAACTGT GGTAATTCTAGAGCTAATAC ATGCCGACGG 3840 
GCGCTGACCC CCTTCGCGGG GGGGATGCGT GCATTTATCAGATCAAAACC AACCCGGTCA 3900 GCCCCTCTCC GGCCCCGGCC GGGGGGCGGG CGCCGGCGGCTTTGGTGACT CTAGATAACC 3960 TCGGGCCGAT CGCACGCCCC CCGTGGCGGC GACGACCCATTCGAACGTCT GCCCTATCAA 4020 CTTTCGATGG TAGTCGCCGT GCCTACCATG GTGACCACGGGTGACGGGGA ATCAGGGTTC 4080 GATTCCGGAG AGGGAGCCTG AGAAACGGCT ACCACATCCAAGGAAGGCAG CAGGCGCGCA 4140 AATTACCCAC TCCCGACCCG GGGAGGTAGT GACGAAAAATAACAATACAG GACTCTTTCG 4200 AGGCCCTGTA ATTGGAATGA GTCCACTTTA AATCCTTTAACGAGGATCCA TTGGAGGGCA 4260 AGTCTGGTGC CAGCAGCCGC GGTAATTCCA GCTCCAATAGCGTATATTAA AGTTGCTGCA 4320 GTTAAAAAGC TCGTAGTTGG ATCTTGGGAG CGGGCGGGCGGTCCGCCGCG AGGCGAGCCA 4380 CCGCCCGTCC CCGCCCCTTG CCTCTCGGCG CCCCCTCGATGCTCTTAGCT GAGTGTCCCG 4440 CGGGGCCCGA AGCGTTTACT TTGAAAAAAT TAGAGTGTTCAAAGCAGGCC CGAGCCGCCT 4500 GGATACCGCA GCTAGGAATA ATGGAATAGG ACCGCGGTTCTATTTTGTTG GTTTTCGGAA 4560 CTGAGGCCAT GATTAAGAGG GACGGCCGGG GGCATTCGTATTGCGCCGCT AGAGGTGAAA 4620 TTCTTGGACC GGCGCAAGAC GGACCAGAGC GAAAGCATTTGCCAAGAATG TTTTCATTAA 4680 TCAAGAACGA AAGTCGGAGG TTCGAAGACG ATCAGATACCGTCGTAGTTC CGACCATAAA 4740 CGATGCCGAC CGGCGATGCG GCGGCGTTAT TCCCATGACCCGCCGGGCAG CTTCCGGGAA 4800 ACCAAAGTCT TTGGGTTCCG GGGGGAGTAT GGTTGCAAAGCTGAAACTTA AAGGAATTGA 4860 CGGAAGGGCA CCACCAGGAG TGGAGCCTGC GGCTTAATTTGACTCAACAC GGGAAACCTC 4920 ACCCGGCCCG GACACGGACA GGATTGACAG ATTGATAGCTCTTTCTCGAT TCCGTGGGTG 4980 GTGGTGCATG GCCGTTCTTA GTTGGTGGAG CGATTTGTCTGGTTAATTCC GATAACGAAC 5040 GAGACTCTGG CATGCTAACT AGTTACGCGA CCCCCGAGCGGTCGGCGTCC CCCAACTTCT 5100 TAGAGGGACA AGTGGCGTTC AGCCACCCGA GATTGAGCAATAACAGGTCT GTGATGCCCT 5160 TAGATGTCCG GGGCTGCACG CGCGCTACAC TGACTGGCTCAGCGTGTGCC TACCCTACGC 5220 CGGCAGGCGC GGGTAACCCG TTGAACCCCA TTCGTGATGGGGATCGGGGA TTGCAATTAT 5280 TCCCCATGAA CGAGGGAATT CCCGAGTAAG TGCGGGTCATAAGCTTGCGT TGATTAAGTC 5340 CCTGCCCTTT GTACACACCG CCCGTCGCTA CTACCGATTGGATGGTTTAG TGAGGCCCTC 5400 GGATCGGCCC CGCCGGGGTC GGCCCACGGC CCTGGCGGAGCGCTGAGAAG ACGGTCGAAC 5460 TTGACTATCT AGAGGAAGTA AAAGTCGTAA CAAGGTTTCCGTAGGTGAAC CTGCGGAAGG 5520 ATCATTAACG GAGCCCGGAG GGCGAGGCCC 
GCGGCGGCGCCGCCGCCGCC GCGCGCTTCC 5580 CTCCGCACAC CCACCCCCCC ACCGCGACGC GGCGCGTGCGCGGGCGGGGC CCGCGTGCCC 5640 GTTCGTTCGC TCGCTCGTTC GTTCGCCGCC CGGCCCCGCCGCCGCGAGAG CCGAGAACTC 5700 GGGAGGGAGA CGGGGGGGAG AGAGAGAGAG AGAGAGAGAGAGAGAGAGAG AGAGAGAGAA 5760 AGAAGGGCGT GTCGTTGGTG TGCGCGTGTC GTGGGGCCGGCGGGCGGCGG GGAGCGGTCC 5820 CCGGCCGCGG CCCCGACGAC GTGGGTGTCG GCGGGCGCGGGGGCGGTTCT CGGCGGCGTC 5880 GCGGCGGGTC TGGGGGGGTC TCGGTGCCCT CCTCCCCGCCGGGGCCCGTC GTCCGGCCCC 5940 GCCGCGCCGG CTCCCCGTCT TCGGGGCCGG CCGGATTCCCGTCGCCTCCG CCGCGCCGCT 6000 CCGCGCCGCC GGGCACGGCC CCGCTCGCTC TCCCCGGCCTTCCCGCTAGG GCGTCTCGAG 6060 GGTCGGGGGC CGGACGCCGG TCCCCTCCCC CGCCTCCTCGTCCGCCCCCC CGCCGTCCAG 6120 GTACCTAGCG CGTTCCGGCG CGGAGGTTTA AAGACCCCTTGGGGGGATCG CCCGTCCGCC 6180 CGTGGGTCGG GGGCGGTGGT GGGCCCGCGG GGGAGTCCCGTCGGGAGGGG CCCGGCCCCT 6240 CCCGCGCCTC CACCGCGGAC TCCGCTCCCC GGCCGGGGCCGCGCCGCCGC CGCCGCCGCG 6300 GCGGCCGTCG GGTGGGGGCT TTACCCGGCG GCCGTCGCGCGCCTGCCGCG CGTGTGGCGT 6360 GCGCCCCGCG CCGTGGGGGC GGGAACCCCC GGGCGCCTGTGGGGTGGTGT CCGCGCTCGC 6420 CCCCGCGTGG GCGGCGCGCG CCTCCCCGTG GTGTGAAACCTTCCGACCCC TCTCCGGAGT 6480 CCGGTCCCGT TTGCTGTCTC GTCTGGCCGG CCTGAGGCAACCCCCTCTCC TCTTGGGCGG 6540 GGGGGGCGGG GGGACGTGCC GCGCCAGGAA GGGCCTCCTCCCGGTGCGTC GTCGGGAGCG 6600 CCCTCGCCAA ATCGACCTCG TACGACTCTT AGCGGTGGATCACTCGGCTC GTGCGTCGAT 6660 GAAGAACGCA GCTAGCTGCG AGAATTAATG TGAATTGCAGGACACATTGA TCATCGACAC 6720 TTCGAACGCA CTTGCGGCCC CGGGTTCCTC CCGGGGCTACGCCTGTCTGA GCGTCGCTTG 6780 CCGATCAATC GCCCCGGGGG TGCCTCCGGG CTCCTCGGGGTGCGCGGCTG GGGGTTCCCT 6840 CGCAGGGCCC GCCGGGGGCC CTCCGTCCCC CTAAGCGCAGACCCGGCGGC GTCCGCCCTC 6900 CTCTTGCCGC CGCGCCCGCC CCTTCCCCCT CCCCCCGCGGGCCCTGCGTG GTCACGCGTC 6960 GGGTGGCGGG GGGGAGAGGG GGGCGCGCCC GGCTGAGAGAGACGGGGAGG GCGGCGCCGC 7020 CGCCGGAAGA CGGAGAGGGA AAGAGAGAGC CGGCTCGGGCCGAGTTCCCG TGGCCGCCGC 7080 CTGCGGTCCG GGTTCCTCCC TCGGGGGGCT CCCTCGCGCCGCGCGCGGCT CGGGGTTCGG 7140 GGTTCGTCGG CCCCGGCCGG GTGGAAGGTC CCGTGCCCGTCGTCGTCGTC GTCGCGCGTC 7200 GTCGGCGGTG GGGGCGTGTT GCGTGCGGTG TGGTGGTGGGGGAGGAGGAA GGCGGGTCCG 7260 
GAAGGGGAAG GGTGCCGGCG GGGAGAGAGG GTCGGGGGAGCGCGTCCCGG TCGCCGCGGT 7320 TCCGCCGCCC GCCCCCGGTG GCGGCCCGGC GTCCGGCCGACCGGCCGCTC CCCGCGCCCC 7380 TCCTCCTCCC CGCCGCCCCT CCTCCGAGGC CCCGCCCGTCCTCCTCGCCC TCCCCGCGCG 7440 TACGCGCGCG CGCCCGCCCG CCCGGCTCGC CTCGCGGCGCGTCGGCCGGG GCCGGGAGCC 7500 CGCCCCGCCG CCCGCCCGTG GCCGCGGCGC CGGGGTTCGCGTGTCCCCGG CGGCGACCCG 7560 CGGGACGCCG CGGTGTCGTC CGCCGTCGCG CGCCCGCCTCCGGCTCGCGG CCGCGCCGCG 7620 CCGCGCCGGG GCCCCGTCCC GAGCTTCCGC GTCGGGGCGGCGCGGCTCCG CCGCCGCGTC 7680 CTCGGACCCG TCCCCCCGAC CTCCGCGGGG GAGACGCGCCGGGGCGTGCG GCGCCCGTCC 7740 CGCCCCCGGC CCGTGCCCCT CCCTCCGGTC GTCCCGCTCCGGCGGGGCGG CGCGGGGGCG 7800 CCGTCGGCCG CGCGCTCTCT CTCCCGTCGC CTCTCCCCCTCGCCGGGCCC GTCTCCCGAC 7860 GGAGCGTCGG GCGGGCGGTC GGGCCGGCGC GATTCCGTCCGTCCGTCCGC CGAGCGGCCC 7920 GTCCCCCTCC GAGACGCGAC CTCAGATCAG ACGTGGCGACCCGCTGAATT TAAGCATATT 7980 AGTCAGCGGA GGAAAAGAAA CTAACCAGGA TTCCCTCAGTAACGGCGAGT GAACAGGGAA 8040 GAGCCCAGCG CCGAATCCCC GCCCCGCGGG GCGCGGGACATGTGGCGTAC GGAAGACCCG 8100 CTCCCCGGCG CCGCTCGTGG GGGGCCCAAG TCCTTCTGATCGAGGCCCAG CCCGTGGACG 8160 GTGTGAGGCC GGTAGCGGCC GGCGCGCGCC CGGGTCTTCCCGGAGTCGGG TTGCTTGGGA 8220 ATGCAGCCCA AAGCGGGTGG TAAACTCCAT CTAAGGCTAAATACCGGCAC GAGACCGATA 8280 GTCAACAAGT ACCGTAAGGG AAAGTTGAAA AGAACTTTGAAGAGAGAGTT CAAGAGGGCG 8340 TGAAACCGTT AAGAGGTAAA CGGGTGGGGT CCGCGCAGTCCGCCCGGAGG ATTCAACCCG 8400 GCGGCGGGTC CGGCCGTGTC GGCGGCCCGG CGGATCTTTCCCGCCCCCCG TTCCTCCCGA 8460 CCCCTCCACC CGCCCTCCCT TCCCCCGCCG CCCCTCCTCCTCCTCCCCGG AGGGGGCGGG 8520 CTCCGGCGGG TGCGGGGGTG GGCGGGCGGG GCCGGGGGTGGGGTCGGCGG GGGACCGTCC 8580 CCCGACCGGC GACCGGCCGC CGCCGGGCGC ATTTCCACCGCGGCGGTGCG CCGCGACCGG 8640 CTCCGGGACG GCTGGGAAGG CCCGGCGGGG AAGGTGGCTCGGGGGGCCCC GTCCGTCCGT 8700 CCGTCCTCCT CCTCCCCCGT CTCCGCCCCC CGGCCCCGCGTCCTCCCTCG GGAGGGCGCG 8760 CGGGTCGGGG CGGCGGCGGC GGCGGCGGTG GCGGCGGCGGCGGGGGCGGC GGGACCGAAA 8820 CCCCCCCCGA GTGTTACAGC CCCCCCGGCA GCAGCACTCGCCGAATCCCG GGGCCGAGGG 8880 AGCGAGACCC GTCGCCGCGC TCTCCCCCCT CCCGGCGCCCACCCCCGCGG GGAATCCCCC 8940 GCGAGGGGGG TCTCCCCCGC GGGGGCGCGC 
CGGCGTCTCCTCGTGGGGGG GCCGGGCCAC 9000 CCCTCCCACG GCGCGACCGC TCTCCCACCC CTCCTCCCCGCGCCCCCGCC CCGGCGACGG 9060 GGGGGGTGCC GCGCGCGGGT CGGGGGGCGG GGCGGACTGTCCCCAGTGCG CCCCGGGCGG 9120 GTCGCGCCGT CGGGCCCGGG GGAGGTTCTC TCGGGGCCACGCGCGCGTCC CCCGAAGAGG 9180 GGGACGGCGG AGCGAGCGCA CGGGGTCGGC GGCGACGTCGGCTACCCACC CGACCCGTCT 9240 TGAAACACGG ACCAAGGAGT CTAACACGTG CGCGAGTCGGGGGCTCGCAC GAAAGCCGCC 9300 GTGGCGCAAT GAAGGTGAAG GCCGGCGCGC TCGCCGGCCGAGGTGGGATC CCGAGGCCTC 9360 TCCAGTCCGC CGAGGGCGCA CCACCGGCCC GTCTCGCCCGCCGCGCCGGG GAGGTGGAGC 9420 ACGAGCGCAC GTGTTAGGAC CCGAAAGATG GTGAACTATGCCTGGGCAGG GCGAAGCCAG 9480 AGGAAACTCT GGTGGAGGTC CGTAGCGGTC CTGACGTGCAAATCGGTCGT CCGACCTGGG 9540 TATAGGGGCG AAAGACTAAT CGAACCATCT AGTAGCTGGTTCCCTCCGAA GTTTCCCTCA 9600 GGATAGCTGG CGCTCTCGCA GACCCGACGC ACCCCCGCCACGCAGTTTTA TCCGGTAAAG 9660 CGAATGATTA GAGGTCTTGG GGCCGAAACG ATCTCAACCTATTCTCAAAC TTTAAATGGG 9720 TAAGAAGCCC GGCTCGCTGG CGTGGAGCCG GGCGTGGAATGCGAGTGCCT AGTGGGCCAC 9780 TTTTGGTAAG CAGAACTGGC GCTGCGGGAT GAACCGAACGCCGGGTTAAG GCGCCCGATG 9840 CCGACGCTCA TCAGACCCCA GAAAAGGTGT TGGTTGATATAGACAGCAGG ACGGTGGCCA 9900 TGGAAGTCGG AATCCGCTAA GGAGTGTGTA ACAACTCACCTGCCGAATCA ACTAGCCCTG 9960 AAAATGGATG GCGCTGGAGC GTCGGGCCCA TACCCGGCCGTCGCCGGCAG TCGAGAGTGG 10020 ACGGGAGCGG CGGGGGCGGC GCGCGCGCGC GCGCGTGTGGTGTGCGTCGG AGGGCGGCGG 10080 CGGCGGCGGC GGCGGGGGTG TGGGGTCCTT CCCCCGCCCCCCCCCCCACG CCTCCTCCCC 10140 TCCTCCCGCC CACGCCCCGC TCCCCGCCCC CGGAGCCCCGCGGACGCTAC GCCGCGACGA 10200 GTAGGAGGGC CGCTGCGGTG AGCCTTGAAG CCTAGGGCGCGGGCCCGGGT GGAGCCGCCG 10260 CAGGTGCAGA TCTTGGTGGT AGTAGCAAAT ATTCAAACGAGAACTTTGAA GGCCGAAGTG 10320 GAGAAGGGTT CCATGTGAAC AGCAGTTGAA CATGGGTCAGTCGGTCCTGA GAGATGGGCG 10380 AGCGCCGTTC CGAAGGGACG GGCGATGGCC TCCGTTGCCCTCGGCCGATC GAAAGGGAGT 10440 CGGGTTCAGA TCCCCGAATC CGGAGTGGCG GAGATGGGCGCCGCGAGGCG TCCAGTGCGG 10500 TAACGCGACC GATCCCGGAG AAGCCGGCGG GAGCCCCGGGGAGAGTTCTC TTTTCTTTGT 10560 GAAGGGCAGG GCGCCCTGGA ATGGGTTCGC CCCGAGAGAGGGGCCCGTGC CTTGGAAAGC 10620 GTCGCGGTTC CGGCGGCGTC CGGTGAGCTC TCGCTGGCCCTTGAAAATCC 
GGGGGAGAGG 10680 GTGTAAATCT CGCGCCGGGC CGTACCCATA TCCGCAGCAGGTCTCCAAGG TGAACAGCCT 10740 CTGGCATGTT GGAACAATGT AGGTAAGGGA AGTCGGCAAGCCGGATCCGT AACTTCGGGA 10800 TAAGGATTGG CTCTAAGGGC TGGGTCGGTC GGGCTGGGGCGCGAAGCGGG GCTGGGCGCG 10860 CGCCGCGGCT GGACGAGGCG CGCGCCCCCC CCACGCCCGGGGCACCCCCC TCGCGGCCCT 10920 CCCCCGCCCC ACCCGCGCGC GCCGCTCGCT CCCTCCCCACCCCGCGCCCT CTCTCTCTCT 10980 CTCTCCCCCG CTCCCCGTCC TCCCCCCTCC CCGGGGGAGCGCCGCGTGGG GGCGCGGCGG 11040 GGGGAGAAGG GTCGGGGCGG CAGGGGCCGC GCGGCGGCCGCCGGGGCGGC CGGCGGGGGC 11100 AGGTCCCCGC GAGGGGGGCC CCGGGGACCC GGGGGGCCGGCGGCGGCGCG GACTCTGGAC 11160 GCGAGCCGGG CCCTTCCCGT GGATCGCCCC AGCTGCGGCGGGCGTCGCGG CCGCCCCCGG 11220 GGAGCCCGGC GGCGGCGCGG CGCGCCCCCC ACCCCCACCCCACGTCTCGG TCGCGCGCGC 11280 GTCCGCTGGG GGCGGGAGCG GTCGGGCGGC GGCGGTCGGCGGGCGGCGGG GCGGGGCGGT 11340 TCGTCCCCCC GCCCTACCCC CCCGGCCCCG TCCGCCCCCCGTTCCCCCCT CCTCCTCGGC 11400 GCGCGGCGGC GGCGGCGGCA GGCGGCGGAG GGGCCGCGGGCCGGTCCCCC CCGCCGGGTC 11460 CGCCCCCGGG GCCGCGGTTC CGCGCGCGCC TCGCCTCGGCCGGCGCCTAG CAGCCGACTT 11520 AGAACTGGTG CGGACCAGGG GAATCCGACT GTTTAATTAAAACAAAGCAT CGCGAAGGCC 11580 CGCGGCGGGT GTTGACGCGA TGTGATTTCT GCCCAGTGCTCTGAATGTCA AAGTGAAGAA 11640 ATTCAATGAA GCGCGGGTAA ACGGCGGGAG TAACTATGACTCTCTTAAGG TAGCCAAATG 11700 CCTCGTCATC TAATTAGTGA CGCGCATGAA TGGATGAACGAGATTCCCAC TGTCCCTACC 11760 TACTATCCAG CGAAACCACA GCCAAGGGAA CGGGCTTGGCGGAATCAGCG GGGAAAGAAG 11820 ACCCTGTTGA GCTTGACTCT AGTCTGGCAC GGTGAAGAGACATGAGAGGT GTAGAATAAG 11880 TGGGAGGCCC CCGGCGCCCC CCCGGTGTCC CCGCGAGGGGCCCGGGGCGG GGTCCGCGGC 11940 CCTGCGGGCC GCCGGTGAAA TACCACTACT CTGATCGTTTTTTCACTGAC CCGGTGAGGC 12000 GGGGGGGCGA GCCCGAGGGG CTCTCGCTTC TGGCGCCAAGCGCCCGCCCG GCCGGGCGCG 12060 ACCCGCTCCG GGGACAGTGC CAGGTGGGGA GTTTGACTGGGGCGGTACAC CTGTCAAACG 12120 GTAACGCAGG TGTCCTAAGG CGAGCTCAGG GAGGACAGAAACCTCCCGTG GAGCAGAAGG 12180 GCAAAAGCTC GCTTGATCTT GATTTTCAGT ACGAATACAGACCGTGAAAG CGGGGCCTCA 12240 CGATCCTTCT GACCTTTTGG GTTTTAAGCA GGAGGTGTCAGAAAAGTTAC CACAGGGATA 12300 ACTGGCTTGT GGCGGCCAAG CGTTCATAGC GACGTCGCTTTTTGATCCTT CGATGTCGGC 
12360 TCTTCCTATC ATTGTGAAGC AGAATTCGCC AAGCGTTGGATTGTTCACCC ACTAATAGGG 12420 AACGTGAGCT GGGTTTAGAC CGTCGTGAGA CAGGTTAGTTTTACCCTACT GATGATGTGT 12480 TGTTGCCATG GTAATCCTGC TCAGTACGAG AGGAACCGCAGGTTCAGACA TTTGGTGTAT 12540 GTGCTTGGCT GAGGAGCCAA TGGGGCGAAG CTACCATCTGTGGGATTATG ACTGAACGCC 12600 TCTAAGTCAG AATCCCGCCC AGGCGAACGA TACGGCAGCGCCGCGGAGCC TCGGTTGGCC 12660 TCGGATAGCC GGTCCCCCGC CTGTCCCCGC CGGCGGGCCGCCCCCCCCTC CACGCGCCCC 12720 GCCGCGGGAG GGCGCGTGCC CCGCCGCGCG CCGGGACCGGGGTCCGGTGC GGAGTGCCCT 12780 TCGTCCTGGG AAACGGGGCG CGGCCGGAAA GGCGGCCGCCCCCTCGCCCG TCACGCACCG 12840 CACGTTCGTG GGGAACCTGG CGCTAAACCA TTCGTAGACGACCTGCTTCT GGGTCGGGGT 12900 TTCGTACGTA GCAGAGCAGC TCCCTCGCTG CGATCTATTGAAAGTCAGCC CTCGACACAA 12960 GGGTTTGTCC GCGCGCGCGT GCGTGCGGGG GGCCCGGCGGGCGTGCGCGT TCGGCGCCGT 13020 CCGTCCTTCC GTTCGTCTTC CTCCCTCCCG GCCTCTCCCGCCGACCGCGG CGTGGTGGTG 13080 GGGTGGGGGG GAGGGCGCGC GACCCCGGTC GGCCGCCCCGCTTCTTCGGT TCCCGCCTCC 13140 TCCCCGTTCA CGCCGGGGCG GCTCGTCCGC TCCGGGCCGGGACGGGGTCC GGGGAGCGTG 13200 GTTTGGGAGC CGCGGAGGCG CCGCGCCGAG CCGGGCCCCGTGGCCCGCCG GTCCCCGTCC 13260 CGGGGGTTGG CCGCGCGGCG CGGTGGGGGG CCACCCGGGGTCCCGGCCCT CGCGCGTCCT 13320 TCCTCCTCGC TCCTCCGCAC GGGTCGACCG ACGAACCGCGGGTGGCGGGC GGCGGGCGGC 13380 GAGCCCCACG GGCGTCCCCG CACCCGGCCG ACCTCCGCTCGCGACCTCTC CTCGGTCGGG 13440 CCTCCGGGGT CGACCGCCTG CGCCCGCGGG CGTGAGACTCAGCGGCGTCT CGCCGTGTCC 13500 CGGGTCGACC GCGGCCTTCT CCACCGAGCG GCGGTGTAGGAGTGCCCGTC GGGACGAACC 13560 GCAACCGGAG CGTCCCCGTC TCGGTCGGCA CCTCCGGGGTCGACCAGCTG CCGCCCGCGA 13620 GCTCCGGACT TAGCCGGCGT CTGCACGTGT CCCGGGTCGACCAGCAGGCG GCCGCCGGAC 13680 GCAGCGGCGC ACGCACGCGA GGGCGTCGAT TCCCCTTCGCGCGCCCGCGC CTCCACCGGC 13740 CTCGGCCCGC GGTGGAGCTG GGACCACGCG GAACTCCCTCTCCCACATTT TTTTCAGCCC 13800 CACCGCGAGT TTGCGTCCGC GGGACCTTTA AGAGGGAGTCACTGCTGCCG TCAGCCAGTA 13860 CTGCCTCCTC CTTTTTCGCT TTTAGGTTTT GCTTGCCTTTTTTTTTTTTT TTTTTTTTTT 13920 TTTTTTCTTT CTTTCTTTCT TTCTTTCTTT CTTTCTTTCTTTCTTTCTTT CGCTTGTCTT 13980 CTTCTTGTGT TCTCTTCTTG CTCTTCCTCT GTCTGTCTCTCTCTCTCTCT CTCTCTCTGT 14040 
CTCTCGCTCT CGCCCTCTCT CTCTTCTCTC TCTCTCTCTCTCTCTCTCTG TCTCTCGCTC 14100 TCGCCCTCTC TCTCTCTCTT CTCTCTGTCT CTCTCTCTCTCTCTCTCTCT CTCTCTCTCT 14160 GTCGCTCTCG CCCTCTCGCT CTCTCTCTGT CTCTGTCTGTGTCTCTCTCT CTCCCTCCCT 14220 CCCTCCCTCC CTCCCTCCCT CCCTCCCCTT CCTTGGCGCCTTCTCGGCTC TTGAGACTTA 14280 GCCGCTGTCT CGCCGTACCC CGGGTCGACC GGCGGGCCTTCTCCACCGAG CGGCGTGCCA 14340 CAGTGCCCGT CGGGACGAGC CGGACCCGCC GCGTCCCCGTCTCGGTCGGC ACCTCCGGGG 14400 TCGACCAGCT GCCGCCCGCG AGCTCCGGAC TTAGCCGGCGTCTGCACGTG TCCCGGGTCG 14460 ACCAGCAGGC GGCCGCCGGA CGCAGCGGCG CACCGACGGAGGGCGCTGAT TCCCGTTCAC 14520 GCGCCCGCGC CTCCACCGGC CTCGGCCCGC CGTGGAGCTGGGACCACGCG GAACTCCCTC 14580 TCCTACATTT TTTTCAGCCC CACCGCGAGT TTGCGTCCGCGGGACCTTTA AGAGGGAGTC 14640 ACTGCTGCCG TCAGCCAGTA CTGCCTCCTC CTTTTTCGCTTTTAGGTTTT GCTTGCCTTT 14700 TTTTTTTTTT TTTTTTTTTT TTTTTTCTTT CTTTCTTTCTTTCTTTCTTT CTTTCTTTCT 14760 TTCTTTCTTT CTTTCGCTCT CGCTCTCTCG CTCTCTCCCTCGCTCGTTTC TTTCTTTCTC 14820 TTTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTGTCTCTCGCTCTCGCCC TCTCTCTCTC 14880 TTTCTCTCTC TCTCTGTCTC TCTCTCTCTC TCTCTCTCTCTCTCTCTCTC CCTCCCTCCC 14940 TCCCCCTCCC TCCCTCTCTC CCCTTCCTTG GCGCCTTCTCGGCTCTTGAG ACTTAGCCGC 15000 TGTCTCGCCG TGTCCCGGGT CGACCGGCGG GCCTTCTCCACCGAGCGGCG TGCCACAGTG 15060 CCCGTCGGGA CGAGCCGGAC CCGCCGCGTC CCCGTCTCGGTCGGCACCTC CGGGGTCGAC 15120 CAGCTGCCGC CCGCGAGCTC CGGACTTAGC CGGCGTCTGCACGTGTCCCG GGTCGACCAG 15180 CAGGCGGCCG CCGGACGCTG CGGCGCACCG ACGCGAGGGCGTCGATTCCG GTTCACGCGC 15240 CGGCGACCTC CACCGGCCTC GGCCCGCGGT GGAGCTGGGACCACGCGGAA CTCCCTCTCC 15300 CACATTTTTT TCAGCCCCAC CGCGAGTTTG CGTCCGCGGGACTTTTAAGA GGGAGTCACT 15360 GCTGCCGTCA GCCAGTAATG CTTCCTCCTT TTTTGCTTTTTGGTTTTGCC TTGCGTTTTC 15420 TTTCTTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTCTTTCTCTCTCTCTC TCTCTCTCTC 15480 TCTCTGTCTC TCTCTCTCTG TCTCTCTCCC CTCCCTCCCTCCTTGGTGCC TTCTCGGCTC 15540 GCTGCTGCTG CTGCCTCTGC CTCCACGGTT CAAGCAAACAGCAAGTTTTC TATTTCGAGT 15600 AAAGACGTAA TTTCACCATT TTGGCCGGGC TGGTCTCGAACTCCCGACCT AGTGATCCGC 15660 CCGCCTCGGC CTCCCAAAGA CTGCTGGGAG TACAGATGTGAGCCACCATG CCCGGCCGAT 15720 TCCTTCCTTT 
TTTCAATCTT ATTTTCTGAA CGCTGCCGTGTATGAACATA CATCTACACA 15780 CACACACACA CACACACACA CACACACACA CACACACACACACACACCCC GTAGTGATAA 15840 AACTATGTAA ATGATATTTC CATAATTAAT ACGTTTATATTATGTTACTT TTAATGGATG 15900 AATATGTATC GAAGCCCCAT TTCATTTACA TACACGTGTATGTATATCCT TCCTCCCTTC 15960 CTTCATTCAT TATTTATTAA TAATTTTCGT TTATTTATTTTCTTTTCTTT TGGGGCCGGC 16020 CCGCCTGGTC TTCTGTCTCT GCGCTCTGGT GACCTCAGCCTCCCAAATAG CTGGGACTAC 16080 AGGGATCTCT TAAGCCCGGG AGGAGAGGTT AACGTGGGCTGTGATCGCAC ACTTCCACTC 16140 CAGCTTACGT GGGCTGCGGT GCGGTGGGGT GGGGTGGGGTGGGGTGGGGT GCAGAGAAAA 16200 CGATTGATTG CGATCTCAAT TGCCTTTTAG CTTCATTCATACCCTGTTAT TTGCTCGTTT 16260 ATTCTCATGG GTTCTTCTGT GTCATTGTCA CGTTCATCGTTTGCTTGCCT GCTTGCCTGT 16320 TTATTTCCTT CCTTCCTTCC TTCCTTCCTT CCTTCCTTCCTTCCTTCCTT CCCTCCCTTA 16380 CTGGCAGGGT CTTCCTCTGT CTCTGCCGCC CAGGATCACCCCAACCTCAA CGCTTTGGAC 16440 CGACCAAACG GTCGTTCTGC CTCTGATCCC TCCCATCCCCATTACCTGAG ACTACAGGCG 16500 CGCACCACCA CACCGGCTGA CTTTTATGTT GTTTCTCATGTTTTCCGTAG GTAGGTATGT 16560 GTGTGTGTGT GTGTGTGTGT GTGTGTGTGT GTGTGTGTGTGTGTGTGTGT GTGTGTATCT 16620 ATGTATGTAC GTATGTATGT ATGTATGTGA GTGAGATGGGTTTCGGGGTT CTATCATGTT 16680 GCCCACGCTG GTCTCGAACT CCTGTCCTCA AGCAATCCGCCTGCCTGCCT CGGCCGCCCA 16740 CACTGCTGCT ATTACAGGCG TGAGACGCTG CGCCTGGCTCCTTCTACATT TGCCTGCCTG 16800 CCTGCCTGCC TGCCTGCCTA TCAATCGTCT TCTTTTTAGTACGGATGTCG TCTCGCTTTA 16860 TTGTCCATGC TCTGGGCACA CGTGGTCTCT TTTCAAACTTCTATGATTAT TATTATTGTA 16920 GGCGTCATCT CACGTGTCGA GGTGATCTCG AACTTTTAGGCTCCAGAGAT CCTCCCGCAT 16980 CGGCCTCCCG GAGTGCTGTG ATGACACGCG TGGGCACGGTACGCTCTGGT CGTGTTTGTC 17040 GTGGGTCGGT TCTTTCCGTT TTTAATACGG GGACTGCGAACGAAGAAAAT TTTCAGACGC 17100 ATCTCACCGA TCCGCCTTTT CGTTCTTTCT TTTTATTCTCTTTAGACGGA GTTTCACTCT 17160 TGTCGCCCAG GGTGGAGTAC GATGGCGGCT CTCGGCTCACCGCACCCTCC GCCTCCCAGG 17220 TTCAAGTGAT TCTCCTGCCT CAGCCTTCCC GAGTAGCTGGAATGACAGAG ATGAGCCATC 17280 GTGCCCGGCT AATTTTTCTA TTTTTAGTAC AGATGGGGTTTCTCCATCTT GGTCAGGCTG 17340 GTCTTCAACT TCCGACCGTT GGAGAATCTT AACTTTCTTGGTGGTGGTTG TTTTCCTTTT 17400 TCTTTTTTTT TCTTTTCTTT 
TCTTTCCTTC TCCTCCCCCCCCCACCCCCC TTGTCGTCGT 17460 CCTCCTCCTC CTCCTCCTCC TCCTCCTCCT CCTCCTCCTCCTCCTCCTCC TCTTTCATTT 17520 CTTTCAGCTG GGCTCTCCTA CTTGTGTTGC TCTGTTGCTCACGCTGGTCT CAAACTCCTG 17580 GCCTTGACTC TTCTCCCGTC ACATCCGCCG TCTGGTTGTTGAAATGAGCA TCTCTCGTAA 17640 AATGGAAAAG ATGAAAGAAA TAAACACGAA GACGGAAAGCACGGTGTGAA CGTTTCTCTT 17700 GCCGTCTCCC GGGGTGTACC TTGGACCCGG AAACACGGAGGGAGCTTGGC TGAGTGGGTT 17760 TTCGGTGCCG AAACCTCCCG AGGGCCTCCT TCCCTCTCCCCCTTGTCCCC GCTTCTCCGC 17820 CAGCCGAGGC TCCCACCGCC GCCCCTGGCA TTTTCCATAGGAGAGGTATG GGAGAGGACT 17880 GACACGCCTT CCAGATCTAT ATCCTGCCGG ACGTCTCTGGCTCGGCGTGC CCCACCGGCT 17940 ACCTGCCACC TTCCAGGGAG CTCTGAGGCG GATGCGACCCCCACCCCCCC GTCACGTCCC 18000 GCTACCCTCC CCCGGCTGGC CTTTGCCGGG CGACCCCAGGGGAACCGCGT TGATGCTGCT 18060 TCGGATCCTC CGGCGAAGAC TTCCACCGGA TGCCCCGGGTGGGCCGGTTG GGATCAGACT 18120 GGACCACCCC GGACCGTGCT GTTCTTGGGG GTGGGTTGACGTACAGGGTG GACTGGCAGC 18180 CCCAGCATTG TAAAGGGTGC GTGGGTATGG AAATGTCACCTAGGATGCCC TCCTTCCCTT 18240 CGGTCTGCCT TCAGCTGCCT CAGGCGTGAA GACAACTTCCCATCGGAACC TCTTCTCTTC 18300 CCTTTCTCCA GCACACAGAT GAGACGCACG AGAGGGAGAAACAGCTCAAT AGATACCGCT 18360 GACCTTCATT TGTGGAATCC TCAGTCATCG ACACACAAGACAGGTGACTA GGCAGGGACA 18420 CAGATCAAAC ACTATTTCCG GGTCCTCGTG GTGGGATTGGTCTCTCTCTC TCTCTCTCTC 18480 TCTCTCTCTC TCTCTCTCTC TCTCGCACGC GCACGCGCGCACACACACAC ACAATTTCCA 18540 TATCTAGTTC ACAGAGCACA CTCACTTCCC CTTTTCACAGTACGCAGGCT GAGTAAAACG 18600 CGCCCCACCC TCCACCCGTT GGCTGACGAA ACCCCTTCTCTACAATTGAT GAAAAAGATG 18660 ATCTGGGCCG GGCACGCTAG CTCACGCCTG TCACTCCGGCACTTTGGGAG GCCGAGGCGG 18720 GTGGATCGCT TGGGGCCGGG AGTTCGAGAC CAGGCTGGCCGACGTGGCGA AACCCCGTCT 18780 CTCTGAAAAA TAGAACGATT AGCCGGGCCT GGTGGCGTGGGCTTGGAATC ACGACCGCTC 18840 GGGAGACTGG GGCGGGCGAC TTGTTCCAAC CGGGGAGGCCGAGGCCGCGA TGAGCTGAGA 18900 TCGTGCCGTG GCGATGCGGC CTGGATGACG GAGCGAGACCCCGTCTCGAG AGAATCATGA 18960 TGTTATTATA AGATGAGTTG TGCGCGGTGA TGGCCGCCTGTAGTCGCGGC TACTCGGGAG 19020 GCTGAGACGA GGAGAAGATC ACTTGAGGCC CCACAGGTCGAGGCTTCGGT CGGCCGTGAC 19080 CCACTGTATC CTGGGCAGTC ACCGGTCAAG 
GAGATATGCCCCTTCCCCGT TTGCTTTTCT 19140 TTTCTTCCCT TCTCTTTTCT TCTTTTTGCT TCTCTTTTCTTTCTTTCTTT CTTTCTTTCT 19200 TTCTTTCTTT CTTTCTTTCT TTTTCTTTTT CTCTCTTCCCCTCTTTCTTT CCTGCCTTCC 19260 TGCCTTTCTT CTTTTCTTCT TTCCTCCCTT CCTCCCTTCCTTCTTTCCTC CCGCCTCAGC 19320 CTCCCAAAGT GCTGGGATGA CTGGCGGGAG GCACCATGCCTGCTTGGCCC AAAGAGACCC 19380 TCTTGGAAAG TGAGACGCAG AGAGCGCCTT CCAGTGATCTCATTGACTGA TTTAGAGACG 19440 GCATCTCGCT CCGTCACCCC GGCAGTGGTG CCGTCGTAACTCACTCCCTG CAGCGTGGAC 19500 GCTCCTGGAC TCGAGCGATC CTTCCACCTC AGCCTCCAGAGTACAGAGCC TGGGACCGCG 19560 GGCACGCGCC ACTGTGCCCA CACCGTTTTT AATTGTTTTTTTTTCCCCCG AGACAGAGTT 19620 TCACTCTCGT GGCCTAGACT GCAGTGCGGT GGCGCGATCTTGGCTCACCG CAACCTCTGC 19680 CTCCCGGTTT CAAGCGATTC TCCTGCATCG GCCTCCTGAGTAGCCGGGAT TGCGGGCATG 19740 CGCTGCCACG TCTGGCTGAT TTCGTATTTT TAGTGGAGACGGGGCTTCTC CATGTCGATC 19800 GGGCTGGTTT CGAACTCCCG ACCTCAGGTG ATCCGCCCTCCCCGGCCTCC GGAAGTGCTG 19860 GGATGACAGG CGTGAGCCAC CGCGCCCGGC CTTCATTTTTAAATGTTTTC CCACAGACGG 19920 GGTCTCATCA TTTCTTTGCA ACCCTCCTGC CCGGCGTCTCAAAGTGCTGG CGTGACGGGC 19980 GTGAGCCACT GCGCCTGGAC TCCGGGGAAT GACTCACGACCACCATCGCT CTACTGATCC 20040 TTTCTTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTCTTTCTTTCTTTCTT TCTTTCTTGA 20100 TGAATTATCT TATGATTTAT TTGTGTACTT ATTTTCAGACGGAGTCTCGC TCTGGGCGGG 20160 GCGAGGCGAG GCGAGGCACA GCGCATCGCT TTGGAAGCCGCGGCAACGCC TTTCAAAGCC 20220 CCATTCGTAT GCACAGAGCC TTATTCCCTT CCTGGAGTTGGAGCTGATGC CTTCCGTAGC 20280 CTTGGGCTTC TCTCCATTCG GAAGCTTGAC AGGCGCAGGGCCACCCAGAG GCTGGCTGCG 20340 GCTGAGGATT AGGGGGTGTG TTGGGGCTGA AAACTGGGTCCCCTATTTTT GATACCTCAG 20400 CCGACACATC CCCCGACCGC CATCGCTTGC TCGCCCTCTGAGATCCCCCG CCTCCACCGC 20460 CTTGCAGGCT CACCTCTTAC TTTCATTTCT TCCTTTCTTGCGTTTGAGGA GGGGGTGCGG 20520 GAATGAGGGT GTGTGTGGGG AGGGGGTGCG GGGTGGGGACGGAGGGGAGC GTCCTAAGGG 20580 TCGATTTAGT GTCATGCCTC TTTCACCACC ACCACCACCACCGAAGATGA CAGCAAGGAT 20640 CGGCTAAATA CCGCGTGTTC TCATCTAGAA GTGGGAACTTACAGATGACA GTTCTTGCAT 20700 GGGCAGAACG AGGGGGACCG GGGACGCGGA AGTCTGCTTGAGGGAGGAGG GGTGGAAGGA 20760 GAGACAGCTT CAGGAAGAAA ACAAAACACG 
AATACTGTCGGACACAGCAC TGACTACCCG 20820 GGTGATGAAA TCATCTGCAC ACTGAACACC CCCGTCACAAGTTTACCTAT GTCACAATCT 20880 TGCACATGTA TCGCTTGAAC GACAAATAAA AGTTAGGGGGGAGAAGAGAG GAGAGAGAGA 20940 GAGAGAGAGA GACAGAGAGA GACAGAGAGA GAGAGAGAGGAGGGAGAGAG GAAAACGAAA 21000 CACCACCTCC TTGACCTGAG TCAGGGGGTT TCTGGCCTTTTGGGAGAACG TTCAGCGACA 21060 ATGCAGTATT TGGGCCCGTT CTTTTTTTTT CTTCTTCTTTTCTTTCTTTT TTTTTGGACT 21120 GAGTCTCTCT CGCTCTGTCA CCCAGGCTGC GGTCGCGGTGGCGCTCTCTC GGCTCACTGA 21180 AACCTCTGCT TCCCGGGTTC CAGTGATTCT TCTTCGGTAGCTGGGATTAC AGGCGCACAC 21240 CATGACGGCG GGCTCATATT CCTATTTTCA GTAGAGACGGGGTTTCTCCA CGTTGGCCAC 21300 GCTGGTCTCG AACTCCTGAC CTCAAATGAT CCGCCTTCCTGGGCCTCCCA AAGTGCTGGA 21360 AACGACAGGC CTGAGCCGCC GGGATTTCAG CCTTTAAAAGCGCGGCCCTG CCACCTTTCG 21420 CTGTGGCCCT TACGCTCAGA ATGACGTGTC CTCTCTGCCGTAGGTTGACT CCTTGAGTCC 21480 CCTAGGCCAT TGCACTGTAG CCTGGGCAGC AAGAGCCAAACTCCGNNCCC CCACCTCCTC 21540 GCGCACATAA TAACTAACTA ACAAACTAAC TAACTAACTAAACTAACTAA CTAACTAAAA 21600 TCTCTACACG TCACCCATAA GTGTGTGTTC CCGTGAGAGTGATTTCTAAG AAATGGTACT 21660 GTACACTGAA CGCAGTGGCT CACGTCTGTC ATCCCGAGGTCAGGAGTTCG AGACCAGCCC 21720 GGCCAACGTG GTGAAACCCC GTCTCTACTG AAAATACGAAATGGAGTCAG GCGCCGTGGG 21780 GCAGGCACCT GTAACCCCAG CTACTCGGGA GGCTGGGGTGGAAGAATTGC TTGAACCTGG 21840 CAGGCGGAGG CTGCAGTGAC CCAAGATCGC ACCACTGCACTACAGCCTGG GCGACAGAGT 21900 GAGACCCGGT CTCCAGATAA ATACGTACAT AAATAAATACACACATACAT ACATACATAC 21960 ATACATACAT ACATACATAC ATCCATGCAT ACAGATATACAAGAAAGAAA AAAAGAAAAG 22020 AAAAGAAAGA GAAAATGAAA GAAAAGGCAC TGTATTGCTACTGGGCTAGG GCCTTCTCTC 22080 TGTCTGTTTC TCTCTGTTCG TCTCTGTCTT TCTCTCTGTGTCTCTTTCTC TGTCTGTCTG 22140 TCTCTTTCTT TCTCTCTGTC TCTGTCTCTG TCTTTGTCTCTCTCTCTCCC TCTCTGCCTG 22200 TCTCACTGTG TCTGTCTTCT GTCTTACTCT CTTTCTCTCCCCGTCTGTCT CTCTCTCTCT 22260 CTCTCCCTCC CTGTTTGTTT CTCTCTCTCC CTCCCTGTCTGTTTCTCTCT CTCTCTTTCT 22320 GTCTGTTTCT GTCTCTCTCT GTCTGTCTAT GTCTTTCTCTGTCTGTCTCT TTCTCTGTCT 22380 GTCTGCCTCT CTCTTTCTTT TTCTGTGTCT CTCTGTCGGTCTCTCTCTCT CTGTCTGTCT 22440 GTCTGTCTCT CTCTCTCTCT CTCTGTGCCT 
ATCTTCTGTCTTACTCTCTT TCTCTGCCTG 22500 TCTGTCTGTC TCTCCCTCCC TTTCTGTTTC TCTCTCTCTCTCTCTCTCTC TCCCCCTCTC 22560 CCTGTCTGTT TCTCTCCGTC TCTCTCTCTT TCTGTCTGTTTCTCACTGTC TCTCTCTGTC 22620 CATCTCTCTC TCTCTCTGTC TGTCTCTTTC GTTCTCTCTGTCTGTCTGTC TCTCTCTCTC 22680 TCTCTCTCTC TCTCTCTCTC TCCCTGTCTG TCTGTTTCTCTCTATCTCTC GCTGTCCATC 22740 TCTGTCTTTC TATGTCTGTC TCTTTCTCTG TCAGTCTGTCAGACACCCCC GTGCCGGGTA 22800 GGGCCCTGCC CCTTCCACGA AAGTGAGAAG CGCGTGCTTCGGTGCTTAGA GAGGCCGAGA 22860 GGAATCTAGA CAGGCGGGCC TTGCTGGGCT TCCCCACTCGGTGTATGATT TCGGGAGGTC 22920 GAGGCCGGGT CCCCGCTTGG ATGCGAGGGG CATTTTCAGACTTTTCTCTC GGTCACGTGT 22980 GGCGTCCGTA CTTCTCCTAT TTCCCCGATA AGCTCCTCGACTTCAACATA AACGGCGTCC 23040 TAAGGGTCGA TTTAGTGTCA TGCCTCTTTC ACCGCCACCACCGAAGATGA AAGCAAAGAT 23100 CGGCTAAATA CCGCGTGTTC TCATCTAGAA GTGGGAACTTACAGATGACA GTTCTTGCAT 23160 GGGCAGAACG AGGGGGACCG GGNACGCGGA AGCCTGCTTGAGGGRGGAGG GGYGGAAGGA 23220 GAGACAGCTT CAGGAAGAAA ACAAAACACG AATACTGTCGGACACAGCAC TGACTACCCG 23280 GGTGATGAAA TCATCTGCAC ACTGAACACC CCCGTCACAAGTTTACCTAT GTCACAGTCT 23340 TGCTCATGTA TGCTTGAACG ACAAATAAAA GTTCGGGGGGGAGAAGAGAG GAGAGAGAGA 23400 GAGAGACGGG GAGAGAGGGG GGAGAGGGGG GGGGAGAGAGAGAGAGAGAG AGAGAGAGAG 23460 AGAGAGAGAG AGAAAGAGAA GTAAAACCAA CCACCACCTCCTTGACCTGA GTCAGGGGGT 23520 TTCTGGCCTT TTGGGAGAAC GTTCAGCGAC AATGCAGTATTTGGGCCCGT TCTTTTTTTC 23580 TTCTTCTTCT TTTCTTTCTT TTTTTTTGGA CTGAGTCTCTCTCGCTCTGT CACCCAGGCT 23640 GCGGTGCGGT GGCGCTCTCT CGGCTCACTG AAACCTCTGCTTCCCGGGTT CCAGTGATTC 23700 TTCTTCGGTA GCTGGGATTA CAGGTGCGCA CCATGACGGCCGGCTCATCG TTCTATTTTT 23760 AGTAGAGACG GGGTTTCTCC ACGTTGGCCA CGCTGGTCTCGAACTCCTGA CCACAAATGA 23820 TCCACCTTCC TGGGCCTCCC AAAGTGCTGG AAACGACAGGCCTGAGCCGC CGGGATTTCA 23880 GCCTTTAAAA GCGCGCGGCC CTGCCACCTT TCGCTGCGGCCCTTACGCTC AGAATGACGT 23940 GTCCTCTCTG CCATAGGTTG ACTCCTTGAG TCCCCTAGGCCATTGCACTG TAGCCTGGGC 24000 AGCAAGAGCC AAACTCCGTC CCCCCACCTC CCCGCGCACATAATAACTAA CTAACTAACT 24060 AACTAACTAA AATCTCTACA CGTCACCCAT AAGTGTGTGTTCCCGTGAGG AGTGATTTCT 24120 AAGAAATGGT ACTGTACACT GAACGCAGGC 
TTCACGTCTGTCATCCCGAG GTCAGGAGTT 24180 CGAGACCAGC CCGGCCCACG TGGTGAAACC CCCGTCTCTACTGAAAATAC GAAATGGAGT 24240 CAGGCGCCGT GGGGCAGGCA CCTGTAACCC CAGCTACTCGGGAGGCTGGG GTGGAAGAAT 24300 TGCTTGAACC TGGCAGGCGG AGGCTGCAGT GACCCAAGATCGCACCACTG CACTACAGCC 24360 TGGGCGACAG AGTGAGACCC GGTCTCCAGA TAAATACGTACATAAATAAA TACACACATA 24420 CATACATACA TACATACAAC ATACATACAT ACAGATATACAAGAAAGAAA AAAAGAAAAG 24480 AAAAGAAAGA GAAAATGAAA GAAAAGGCAC TGTATTGCTACTGGGCTAGG GCCTTCTCTC 24540 TGTCTGTTTC TCTCTGTTCG TCTCTGTCTT TCTCTCTGTGTCTCTTTCTC TGTCTGTCTG 24600 TCTGTCTGTC TGTCTGTCTC TTTCTTTCTT TCTGTCTCTGTCTTTGTCCC TCTCTCTCCC 24660 TCTCTGCCCT GTCTCACTGT GTCTGTCTTC TATCTTACTCTCTTTCTCTC CCCGTCTGTC 24720 TCTCTCTCAC TCCCTCCCTG TCTGTTTCTC TCTCTCTCTCTTTCTGTCTG TTTCTGTCTC 24780 TCTCTGTCTG CCTCTCTCTT TCTCTATCTG TCTCTTTCTCTGTCTGTCTG CCCCTCTCTT 24840 TCTTTTTCTG TGTCTCTCTG TCTGTCTCTC TCTCTCTCTGTGCCTATCTT CTGTCTTACT 24900 CTCTTTCTCT GCCTGTCTGT CTGTCTCTCT CTGTCTCTCCCTCCCTTTCT GCTTCTCTCT 24960 CTCTCTCTCT CTCTNNNCCC TCCCTGTCTG TTTCTCTCTGTCTCCCTCTC TTTCTGTCTG 25020 TTTCTCACTG TCTCTCTCTG TCTGTCTGTT TCATTCTCTCTGTCTCTGTC TCTGTCTCTC 25080 TCTCTCTCTG TCTCTCCCTC TCTGTGTGTA TCTTTTGTCTTACTCTCCTT CTCTGCCTGT 25140 CCGTCTGTCT GTCTGTCTCT CTCTCTCCCT GTCCCTCTCTCTTTCTGTCT GTTTCTCTCT 25200 CTCTCTCTCT CTCTCTCTCT CTGTCTCTGT CTTTCTCTGTCTGTCCCTTT CTCTGTCTGT 25260 CTGCCTCTCT CTTTCTCTTT CTGTGTCTCT CTGTCTCTCTCTCTGTGCCT ATCTTCTGTC 25320 TTACTCTCTT TCTCTGCCTG TCTATCTGTC TGTCTCTCTCTGTCTCTCTC CCTGCCTTTC 25380 TGTTTCTCTC TCTCTCCCTC TCTCGCTCTC TCTGTCTTTCTCTCTTTCTC TCTGTTTCTC 25440 TGTCTCTCTC TGTCCGTCTC TGTCTTTTTC TGTCTGTCTGTCTCTCTCTT TCTTTCTGTC 25500 GTCTGTCTCT GTCTCTGTCT CTGTCTCTCT CTCTCTCTCTCTCCTTGTCT CTCTCACTGT 25560 GTCTGTCTTC TGTCTTACTC TCCTTCTCTG CCTGTCCATCTGTCTGTCTG TCTCTCTCTC 25620 TCTCTCCCTA CCTTTCTGTT TCTCTCTCGC TAGCTCTCTCTCTCTCTGCC TGTTTCTCTC 25680 TTTCTCTCTC TGTCTTTCTC TGTCTGTCTC TTTCTCTGTCTGTCTGTCTC TTTCTCTCTG 25740 TCTCTGTCTC TGTCTCTCTC TCTCTCTCTC TCTCTCTCTCTGCCTCTCTC ACTGTGTCTG 25800 TCTTCTGTCT TATTCTCTTT CTCTCTCTGT 
CTCTCTCTCTCTCTCCTTTA CTGTCTGTTT 25860 CTCTCTCTCT CTCTCTCTTT CTGCCTGTTT CTCTCTGTCTGTCTCTGTCT TTCTCTGTCT 25920 GTCTGCCTCT CTCTTTCTTT TTCTGCGTCT CTCTGTCTCTCTCTCTCTCT CTCTGTTCCT 25980 ATCTTCTGTC TTACTCTGTT TCCTTGCCTG CCTGCCTGTCTGTGTGTCTG TCTCTCTCTC 26040 TCTCTCTCTC TCTCTCTCCC TCCCTTTCTC TTTCTCTGTCTCTCTCTCTC TTTCTGGGTG 26100 TTTCTCTCTG TCTCTCTGTC CATCTCTGTC TTTCTATGTCTGTCTCTCTC TTTCTCTCTG 26160 TCTCTGTCTC TGCCTCTCTC TCTCTCTCTC TCTCTCTCTCTCTGTCTGTC TCTCTCACTG 26220 TGTGTGTCTG TCTTCTGTCT TACTCTCCTT CTCTGCCTGTCCGTCTGTCT GTCTGTCTCT 26280 CCCTCTCTCT CCCTCCCTTT CTGTTTCTCT CTCTCTCTCTTTCTGTCTGT TTCTCTCTTT 26340 CTCTCTCTGT CTGTCTCTTT CTCTGTCTGT CTGTCTCTCTCTTTCTTTTT CTCTGTCTCT 26400 CTGTCTCTCT CTGTGTCTGT CTCTCTGTCT GTGCCTATCTTCTGTCTTAC TCTCTTTCTC 26460 TGGCTGTCTG CCTGTCTCTC TCTCTCTCTC TGTCTGTCTCCGTCCCTCTC TCCCTGTCTG 26520 TCTGTTTCTC TCTCTGCCTC TCTCTCTCTC TGTCTGTCTCTTTCTCTGTC TGTCTGTCTC 26580 TCTCTTTCTT TTTCTCTGTC TCTCTGTCTC TCTCTGTGTCTGTCTCTCTT TCTGTG 26640 TCTTCTGTCT TACTCTCTTT CTCTGGCTGT CTGCCTGTCTCTCTCTCTCT GCCTGT 26700 GTCCCTCCCT CCCTGTCTGT CTGTTTCTCT CTCTGTCTCTGTCTCTCTGT CCATCT 26760 CTGTCTCTTT CTCTTTCTCT CTCTCTGTCT CTGTCTCTCTCTCTCTCTGC CTGTCT 26820 CACTGTGTCT GTCTTCTGTC TTACTCTCTT TCTCTTGCCTGCCTCTCTGT CTGTCT 26880 CTCTCCCTCC ATGTCTCTCT CTCTCTCTCA CTCACTCTCTCTCCGTCTCT CTCTCT 26940 GTCTGTTTCT CTCTCTGTCT GTCTCTCTCC CTCCATGTCTCTCTCTCTCT CTCTCA 27000 CTCTCTCTCC GTCTCTCTCT CTCTTTCTGT CTGTTTCTCTCTCTGTCTGT CTCTCT 27060 CCATGTCTCT CTCTCTCCCT CTCACTCACT CTCTCTCCGTCTCTCTCTCT CTTTCT 27120 GTTTCTTTGT CTGTCTGTCT GTCTGTCTGT CTGTCTCTCTCTCTCTCTCT CTCTCT 27180 CTCTCTGTTT GTCTTTCTCC CTCCCTGTCT GTCTGTCTGTCTCTCTCTCT CTGTCT 27240 CTCTGTCTCT CTCTCTTTCT CTTTCTGTCT GTTTCTCTCTATCTCTCGCT GTCCAT 27300 GTCTTTCTAT GTCTGTCTCT TTCTCTGTCA GTCTGTCAGACACACCCGTG CCGGTA 27360 CCTGCCCTTC CACGAGAGTG AGAAGCGCGT GCTTCGGTGCTTAGAGAGGC CGAGAG 27420 CTAGACAGGC GGGCCTTGCT GGGCTTCCCC ACTCGGTGTACGATTTCGGG AGGTCG 27480 CGGGTCCCCG CTTGGATGCG AGGGGCATTT TCAGACTTTTCTCTCGGTCA CGTGTG 27540 CCGTACTTCT CCTATTTCCC CGATAAGTCT 
CCTCGACTTCAACATAAACT GTTAAG 27600 GACGCCAACA CGGCGAAACC CCGTCTCTAC TAAAAATACAAAGCTGAGTC GGGAGC 27660 GGGCAGGCCC TGTAATGCCA GCTCCTCGGG AGGCTGAGGCGGGAGAATCG CTTGAA 27720 GGAAGCGGAG GCTGCAGGGA GCCGAGATCG CGCCACTGCACTACGGCCCA GGCTGT 27780 TGAGTGAGAC TCGGTCTCTA AATAAATACG GAAATTAATTAATTCATTAA TTCTTT 27840 TGCTGACGGA CATTTGCAGG CAGGCATCGG TTGTCTTCGGGCATCACCTA GCGGCC 27900 TTATTGAAAG TCGACGTTGA CACGGAGGGA GGTCTCGCCGACTTCACCGA GCCTGG 27960 ACGGGTTTCT CTCTCTCCCT TCTGGAGGCC CCTCCCTCTCTCCCTCGTTG CCTAGG 28020 CTCGCCTAGG GAACCTCCGC CCTGGGGGCC CTATTGTTCTTTGATCGGCG CTTTAC 28080 CTTTGTGTTT TGGCGCCTAG ACTCTTCTAC TTGGGCTTTGGGAAGGGTCA GTTTAA 28140 CAAGTTGCCC CCCGGCTCCC CCCACTACCC ACGTCCCTTCACCTTAATTT AGTGAG 28200 TTAGGTGGGT TTCCCCCAAA CCGCCCCCCC CCCCCCGCCTCCCAACACCC TGCTTG 28260 CCTTCCAGAG CCACCCCGGT GTGCCTCCGT CTTCTCTCCCCTTCCCCCAC CCCTTG 28320 CGATCTCATT CTTGCCAGGC TGACATTTGC ATCGGTGGGCGTCAGGCCTC ACTCGG 28380 CACCGTTTTT GAAGATGGGG GCGGCACGGT CCCACTTCCCCGGAGGCAGC TTGGGC 28440 GGCATAGCCC CTTGACCCGC GTGGGCAAGC GGGCGGGTCTGCAGTTGTGA GGCTTT 28500 CCCGCTGCTT CCCGCTCAGG CCTCCCTCCC TAGGAAAGCTTCACCCTGGC TGGGTC 28560 TCACCTTTTA TCACGATGTT TTAGTTTCTC CGCCCTCCGGCCAGCAGAGT TTCACA 28620 GAAGGGCGCC ACGGCTCTAG TCTGGGCCTT CTCAGTACTTGCCCAAAATA GAAACG 28680 CTGAAAACTA ATAACTTTNC TCACTTAAGA TTTCCAGGGACGGCGCCTTG GCCCGT 28740 GTTGGCTTGT TTTGTTTCGT TCTGTTTTGT TTTGTTCGTGTTTTTCCTTT CTCGTA 28800 TTTCTTTTCA GGTGAAGTAG AAATCCCCAG TTTTCAGGAAGACGTCTATT TTCCCC 28860 CACGTTAGCT GCCGTTTTTT CCTGTTGTGA ACTAGCGCTTTTGTGACTCT CTCAAC 28920 CAGTGAGAGC CGGTTGATGT TTACNATCCT TCATCATGACATCTTATTTT CTAGAA 28980 GTAGGCGAAT GCTGCTGCTG CTCTTGTTGC TGTTGTTGTTGTTGTTGTTG TCGTCG 29040 TGTTGTCGTT GTCGTTGTTG TTGTCGTTGT CGTTGTTTTCAAAGTATACC CCGGCC 29100 TTTATGGGAT CAAAAGCATT ATAAAATATG TGTGATTATTTCTTGAGCAC GCCCTT 29160 CCCCTCTCTC TGTCTCTCTG TCTGTCTCTG TCTCTCTCTTTCTCTGTCTG TCTTCT 29220 CTCTCTCTCT CTGTGTCTCT CTCTCTCTGC CTGTCTGTTTCTCTCTCTCT GCCTCT 29280 CTCTCTCTCT CTCTGCCTGT CTCTCTCACT GTGTCTGTCTTCTGTCTTAC TCCCTT 29340 TGTCTGTCTG TCGGTCTCTC 
TCTCTCTCTC TCCCTGTCTGTATGTTTCTC TCTGTC 29400 TCTCTCTCTC TCTTTCTGTT TCTCTCTCTC CGTCTCTGTCTTTCTCTGAC TGTCTC 29460 TTTCCTTCTC TCTGTCTCTC TCTGCCTGTC TCTCTCACTCTGTCTTCTGT CTTATC 29520 TCTCTGCCTG CCTGTCTCTC TCACTCTCTC TCTCTGTGTGTCTCTCTCTC TCTTTC 29580 TCTCTCTGTC TCTCTGTCCG TCTCTGTCTT TCTCTGTCTGTCTCTTTGTC TGTCTG 29640 TGTCTTTCCT TCTCTCTGTC TCTGTCTCTC TCACTGTGTCTGTCTTCTGT CTTAGT 29700 CTCTCTCTCT CTCCCTGTCT GTCTGTCTCT CTCTCTCTCTCCCCCTGTCT GTTTCT 29760 CTCTCTCTCT CTCTCTCTCT CTCTGTCTTT GTCTTTCTTTCTGTCTCTGT CTCTCT 29820 CTCTCTGTGT GTCTGTCTTC TGTCTTACTG TCTTTCTCTGCCTGTCTGTC TGTCTG 29880 TCTCTGTCTG TCTCTCTCTC TCTCTCCCCC TGTCGGCTGTTTCTCTGTCT CTGTCT 29940 CTCTCTTTCT GTCTGTTTCT CTCTGTCTGT CTTTCTCTCTCTGTCTCTTT CTCTCT 30000 CTCTGTCTGT CTCTGTCTCT CTCTCTGTCT CTCTCTCTCTGTGGGGGTGT GTGTGT 30060 GTGTATGTGT GTGTGTGTGT GTGTGTGTGT CTGCCTTCTGTCTTACTCTC TTTCTC 30120 TGTCTGTCTG CCTGTCTGTT TGTCTCTCTC TCTCTGCCTGTCTCTCTCCC TTCCTG 30180 TTTCTCTCTC TTTCTGTTTC TCTCTGTCTC TGTCCATCTCTGTCTTTCTC CGTCTG 30240 TTTATCTGTC TCTCTCCGTC TGTCTCTTTA TCTGTCTCTCTCTCTCTTTC TGTCTT 30300 TCTCTGTGTA TCGTTGTCTC TCTCTGTCTG TCTCTGTCTCTGTCTCTCTG TCTCTC 30360 TCTCTCTCTC TCTCTGTCTG TCTGTCCGTC TGTCTGTCTCGGTCTCTGCG TCTCGC 30420 TCCCGCCCTC TCTTTTTTTG CAAAAGAAGC TCAAGTACATCTAATCTAAT CCCTTA 30480 GGCCTGAATT CTTCACTTCT GACATCCCAG ATTTGATCTCCCTACAGAAT GCTGTA 30540 ACTGGCGAGT TGATTTCTGG ACTTGGATAC CTCATAGAAACTACATATGA ATAAAG 30600 AATCCTAAAA TCTGGGGTGG CTTCTCCCTC GACTGTCTCGAAAAATCGTA CCTCTG 30660 CCTAGGATGC CGGAAGAGTT TTCTCAATGT GCATCTGCCCGTGTCCTAAG TGATCT 30720 CCGAGCCCTG TCCGTCCTGT CTCAAATATG TACGTGCAAACACTTCTCTC CATTTC 30780 ACTACCCACG GCCCCTTGTG GAACCACTGG CTCTTTGAAAAAAATCCCAG AAGTGG 30840 GGCTTTTTGG CTAGGAGGCC TAAGCCTGCT GAGAACTTTCCTGCCCAGGA TCCTCG 30900 CATGCTTGCT AGCGCTGGAT GAGTCTCTGG AAGGACGCACGGGACTCCGC AAAGCT 30960 TGTCCCACCG AGGTCAAATG GATACCTCTG CATTGGCCCGAGGCCTCCGA AGTACA 31020 CGTCACCAAC CGTCACCGTC AGCATCCTTG TGAGCCTGCCCAAGGCCCCG CCTCCG 31080 GACTCTTGGG AGCCCGGCCT TCGTCGGCTA AAGTCCAAAGGGATGGTGAC TTCCAC 31140 AAGGTCCCAC 
TGAACGGCGA AGATGTGGAG CGTAGGTCAGAGAGGGGACC AGGAGG 31200 ACGTCCCGAC AGGCGACGAG TTCCCAAGGC TCTGGCCACCCCACCCACGC CCCACG 31260 ACGTCCCGGG CACCCGCGGG ACACCGCCGC TTTATCCCCTCCTCTGTCCA CAGCCG 31320 CACCCCACCA CGCAACCCAC GCACACACGC TGGAGGTTCCAAAACCACAC GGTGTG 31380 GAGCCTGACG GAGCGAGAGC CCATTTCACG AGGTGGGAGGGGTGGGGGTG GGGTGG 31440 GGGGTTGTGG GGTCTGTGGC GAGCCCGATT CTCCCTCTTGGGTGGCTACA GGCTAG 31500 GAATATCGCT TCTTGGGGGG AGGGGCTTCC TTAGGCCATCACCGCTTGCG GGACTA 31560 TCAAACCCTC CCTTGAGGCC ACAAAATAGA TTCCACCCCACCCATCGACG TTTCCC 31620 GTGCTGGATG TATCCTGTCA AGAGACCTGA GCCTGACACCGTCGAATTAA ACACCT 31680 TGGCTTTGTG TGTTTGTTTG TTTCTGAGAT GGAGTCTTGCTCTGTCCCCC AGGCTG 31740 GCAGTGGCGT GATCTCAGCT CACTGGAACC TCTGCCTCCTGGGTTCAAGT GATTCT 31800 TCTCAGCGCC ACCATGGCCG GCTCATTTTT TTTTTTTTTTTTTTTGGTAG ACACGG 31860 TCACCCTCTT TCATTGGTTT TCACTGGAGA TTCTAGATTCGAGCCACACC TCATTC 31920 CCACAGAGAG ACTTCTTTTT TTTTTTTTTT TTTTTAAGCGCAACGCAACA TGTCTG 31980 ATTTGAGTGG CTTCCTATAT CATTATAATT GTGTTATAGATGAAGAAACG GTATTA 32040 CTGTGCTAAT GATAGTGAAA GTGAAGACAA AAGAAAGGCTATCTATTTTG TGGTTA 32100 AAAGTTGCTC AGTATTTAGA AGCTACCTAA ATACGTCAGCATTTACACTC TTCCTA 32160 AAGCTGGCCG ATCTGAATAA TCCTCCTTTA AACAAACACAATTTTTGATA GGGTTA 32220 TTTTTTAAGA ATGCGACTCC TGCAAAATAG CTGAACAGACGATACACATT TAAAAA 32280 ACAACACAAG GATCAACCAG ACTTGGGAAA AAATCGAAAACCACACAAGT CTTATG 32340 ACTGAGTTCT TAAAATAGGA CGGAGAACGT AGCTATCGGAAGAGAAGGCA GTATTG 32400 GTTGATTGTT ACGTTGGTCA GCAGTAGCTG GCACTATCTTTTTGGCCATC TTTCGG 32460 TGTAACTACT ACAGCAAAAT GAGATATGAT CCATTAAACAACATATTCGC AAATCA 32520 GTGTTTCAGT AATATAATGC TTCAGATTTA GAAGCAAATCAAATGATAGA ACTCCA 32580 TGTAATAAGT CACCCCAAAG ATCACCGTAT CTGACAAAATAACTACCACA GGGTTA 32640 TTCAGAATCA TACTTTCTTC TTGATATTTA CTTATGTATTTATTTTTTTT AATTTA 32700 TCTTGAGACG CGTCTCGCTC TGTCGCCCAG GCTGGAGTGCGATGGTGTGA TCTCGG 32760 CTGCAACCGC CACCTCCCTG GGTTCAAGCG ATTCTCCTGCCTCAGCCTCC CGAGTA 32820 GGACTACAGG TGCCCGCCAC CACGCCCAGC TAATCTTTATACTTTTAATA GAGACG 32880 TTCACCGTGT CGGCCCGGAT GGTCTCGATC TCTTGACCTCGTGACCCGCC CGCCTC 32940 
TCCCAAAGTG CTGGGATGAC AGGCGTGAGC CACTGAGCCCGGCCTTCTCT TGACGT 33000 ACTATGAAGT CAGTCCAGAG AAACGCAATA AATGTCAACGGTGAGGATGG TGTTGA 33060 GAAGTAGGAC CACACTTTTT CCTATCTTAT TCAGTTGATAACAATATGAC CTAGGT 33120 ATTTCCTATG TGCCTACTTA TACACGAGTA CAAAAGAGTAAAACAGAGAG ACTGCT 33180 TAAAGGGTAC GTGAAGTTCT TCATAGTAAC TCCGTAAACTGGAACACTGT CAAAAA 33240 CAGCTAGTGA ATTGTTTCCA TGTATTTTTC TATTATCCAATAAGTGAACT ATGCTA 33300 TTTCCAGTCT CCCAAGCACT TCTTGTCCCC ATCACCACTTCGGTGCTCGA AGAAAA 33360 AGCAAATCAA GGAACACAAG CTAAAGAAAC ACACACACAAACCAAAGACA ACTACA 33420 CTGCAAAAGT TTGCTAGAAG ACTGAAACTG TTGAGTATAAGGATCTGGTA TTCTAC 33480 ATGAGTTCAC TTCAGAGTTT GTTCAAGACA TACGTTTCGTAAGGAAACAT CTTAGT 33540 AGTTATTCAG CAGTAGGTAC CATCCCTAAG TATTTTTCACCAAATCCGTG ACAATA 33600 GCTATCTAAC CAGAAAAATT AGCGAGTACG GGCACCATCCATAGGGCTTT GTCTTT 33660 TTCATTAGCA CTTACCATGC CTTACAATGT CTAGGATTGACCCTGATAGC ATTTCG 33720 CAAGCTAATG CTTTGTCCAG TTCTTCAGTG AAGACAACTCACGCCCTAAT GCGCTA 33780 CATAAGCATC ATTTGGATCC ACTTCGAGAG TTCTCTGGAAGAATTGAATC GCAATA 33840 GTTCCCGTTT GCAGACCGAA ACAGTTTCCC TGCAGCACACCAGGCCTCTG GCTGGC 33900 TTTTATCCAT GTCTGTGAAG TCTTTGGACA GAACTGAAAGAGCAACCTCT TTCGGA 33960 GCCAAAGTGT TGTAGAGTAG ATCTCCATGC CTTCGACTCTGTAATTCTCA ATCCTC 34020 CCTCTGAGAA TTGTCTTTCA GCTTGCGTGG ACTCTGAAAGTTTACAATAG GCCNTT 34080 ATTTGGCACA GTACCCAACC GGTATTGCAG TGGTGAGAAGCTAGATGGCT CAAGAT 34140 ATAGCTTCTT TGCCGTGGTA AGAACACAAA GCTAAATAACCTTTCCCCCT TTCACG 34200 AGGCTCATCA AGCCTTCCGC TGCTGCTTTT TGTAGATTAAAAGCCTGAAT CTGAGG 34260 ATTGCGGCTA TTTTCCCTTC TGAAATGACG GAAGAGTCCAATTTTGTCAC TTCCAG 34320 TCACTTATGT TCGGTGGAGT TATTGCTCCT TTATTAGTTTTACTTTTGGT TCTTCT 34380 GGGATTTTAG GTGGAAACTT CATTTTTAAT TTTCTCCTAATTCTCCTCGG TTGTGG 34440 GTCACTAGTC AAGAGTCGTG AATTTCTTCG AGGNCGGTGCATTTGGGGGA GATGCC 34500 TGGGGCTCAA TACCTGAGGT GTTGCCCTTG TCGGCGGACCAGAACTTTGT GTTTTT 34560 GGACTGGAGT TACCTTTCGG CTCTTTCCCC TCTGCGAGAAGACAGACGGT GTTCCG 34620 GGCCGATTCT GGCAACAGGC TTTTCTGAAG GGGCTCCGGTGGATGGCACG TCAGTG 34680 ACGGTGTCTC ATACCAGTGC AGTTTTGTCA ATAGGGTCCGTCTCCGGGAC 
TTGGGG 34740 TAATGGCAAA ATGCCAACAC TTGGGGTTAA TGGACTAACAGCTGCTGGTC CTCCTA 34800 ACTTCGACCA GTTTTTGGTT TATGTTGAAC CTGTTTAGATCATATGGAAG TTCCTG 34860 CAGTGGGACA GTATCAGGTG AAAGGACAGC TGAATCGATAGAAGACACTG GGGAGT 34920 ATTCAAGGAG TACTTTGAAT TGGAAGATTC TAAATTCCATCCGTTTCATT CGACGG 34980 CTGGGGTGTT TCCGTAAGAA CGGTCTCGGG CTGTCTGTGACATAAACTAG GACGAG 35040 AAGTGTTGTG GCGCAACACT TGGACAGGCA GTTGCTAAAGCTCTCTAGAG AGGTGA 35100 AAATGTTTGG TCAGGATCTG GCTTTTCCCC CCTATTTCACATCATGATTC AAAGGG 35160 CAGAGGAAAG GATTTCAACG AAGGCTCTTT TGGTCACATTCTGATCCTTT GGTAAG 35220 TCTGTCTTGC AATATACATG TCCCGACGAT GGAAGGGGAAAGCGAGCTGA ATCACC 35280 TCAGGAACGA TAATATCATC GTGGCTTTTC TGCTTATGAAACACTCCACC CGATAA 35340 TGATCCCCTT CTGCAAGCTT GCTGAGATCA ACACAACATTTCGCAAGCAG GCATTT 35400 TGCGGGGTAG TACAACTGTG TCCTTTCAAG AGTCTATATGTTTTATAGGC CTTTCC 35460 CGGTAAGAAC AGGTCGCCAG TAAGAACAAG GCTTCTTCTGAGTGTACTTC TGCATA 35520 CGTTCTGCGG GGGAAACCGC ATCTCGGTAG GCATAGTGGTTTAGTGCTTG CCATAT 35580 GCCTGGACGG GTCCCTGCAG CACCGCCATC CTCGAGGCTCAGGCCCACTT TCTGCA 35640 CACAGGCACC CCCCCCCCCC CATAGCGGCT CCGGCCCGGCCAGCCCCGGC TCATTT 35700 GCACCAGCCG CCGTTACCGG GGGATGGGGG AGTCCGAGACAGAATGACTT CTTTAT 35760 CTGACTCTGG AAAGCCCGGC GCCTTGTGAT CCATTGCAAACCGAGAGTCA CCTCGT 35820 AGAACACGGA TCCACTCCCA AGTTCAGTGG GGGGATGTGAGGGGTGTGGC AGGTAG 35880 AAGGACTCTC TTCCTTCTGA TTCGGTCTGC ACAGTGGGGCCTAGGGCTGG AGCTCT 35940 GTGCGGACCG CTGACTCCCT CTACCTTGGG TTCCCTCGGCCCCACCCTGG AACGCC 36000 CTTGGCAGAT TCTGGCCCTT TCTGGCCCTT CAGTCGCTGTCAGAAACCCC ATCTCA 36060 CGGATGCCCC GAGTGACTGT GGCTCGCACC TCTCCGGAAACATTGGAAAT CTCTCC 36120 CGCGCGGCCA CCTGAAACCA CAGGAGCTCG GGACACACGTGCTTTCGGGA GAGAAT 36180 AGAGTCTCTC GCCGACTCTC TCTTGACTTG AGTTCTTCGTGGGTGCGTGG TTAAGA 36240 GTGAGACCAG ATGTATTAAC TCAGGCCGGG TGCTGGTGGCTCACGCCTGT AACCCC 36300 CTTTGGGAGG CCGAGGCCGT AGGATCCCTC GAGGAATCGCCTAACCCTGG GGAGGT 36360 GTTGCAGTGA GTGAGCCATA GTTGTGTCAC TGTGCTCCAGTCTGGGCGAA AGACAG 36420 AGGCCCTGCC ACAGGCAGGC AGGCAGGCAG GCAGGCAGAAAGACAACAGC TGTATT 36480 TCTTCTCAGG GTAGGAAGCA AAAATAACAG 
AATACAGCACTTAATTAATT TTTTTT 36540 CCTTCGGACG GAGTTTCACT CTTGGTGCCC ACGCTGGAGTGCAGTGGCAC CATCTC 36600 CACCGCAACC TCCACCTCCC GCGTTCAAGC GATTCTCCTGCCTCAGCCTC CTGAGT 36660 GGGATTACAG GGAGGAGCCA CCACACCCAG CTGATTTTGTATTGTTAGTA GAGACG 36720 TTCTCCATGT GGGTCAGGCT GGTCTCGAAC TGGCGACCCCAGTGGATCTG CCCGCC 36780 CCTCCCAAAG TGCTGGGGTG ACAGGCGTGA GCCATCGTGACTGGCCGGCT ACGTTT 36840 ATTTATTTTT TTAATTATTT TACTTTTTTT TAGTTTTCCATTTTAATCTA TTTATT 36900 TACATTTATT TATTTATTTA TTTATTTACT TATTTATTTATTTTCGAGAC AGACTC 36960 TCTGCTGCCC AGGCTGGAGT GCAGCGGCGT GATCTCGGCTCACTGCAACG TCCGCC 37020 GGGTTCACGC CATTCTCCTG CCTCAGCCTC CCAAGTAGCTGGGACTACAG GCGCCC 37080 CCGTGCCCGG CTAACTTTTT GTATTTTGAG TAGAGATGGGGTTTCACTGT GGTAGC 37140 ATGGTCTCGA TCTCCTGACC CCGTGATCCG TCCACCTCGGCCTCCCAAAG TGCTGG 37200 ACAGGCGTGA GCCACCGGCC CCGGCCTATT TATCTATTTATTAACTTTGA GTCCAG 37260 TGAAACCAGT TAGTTTTTGT AATTTTTTTT TTTTTTTTTTTTTTTTGAGA CGAGGT 37320 CCGTGTTGCC AAGGCTTGGA CCGAGGGATC CACCGGCCCTCGGCCTCCCA AAAGTG 37380 GATGACAGGC GCGAGCCTAC CGCGCCCGGA CCCCCCCTTTCCCCTTCCCC CGCTTG 37440 CCCGACAGAC AGTTTCACGG CAGAGCGTTT GGCTGGCGTGCTTAAACTCA TTCTAA 37500 AAATTTGGGA CGTCAGCTTC TGGCCTCACG GACTCTGAGCCGAGGAGTCC CCTGGT 37560 CTATCACAGG ACCGTACACG TAAGGAGGAG AAAAATCGTAACGTTCAAAG TCAGTC 37620 TGTGATACAG AAATACACGG ATTCACCCAA AACACAGAAACCAGTCTTTT AGAAAT 37680 TTAGCCCTGG TGTCCGTGCC AGTGATTCTT TTCGGTTTGGACCTTGACTG AGAGGA 37740 CAGTCGGTCT CTCGTCTCTG GACGGAAGTT CCAGATGATCCGATGGGTGG GGGACT 37800 CTGCGTCCCC CCAGGAGCCC TGGTCGATTA GTTGTGGGGATCGCCTTGGA GGGCGC 37860 ACCCACTGTG CTGTGGGAGC CTCCATCCTT CCCCCCACCCCCTCCCCAGG GGGATC 37920 TTCATTCCGG GCTGACACGC TCACTGGCAG GCGTCGGGCATCACCTAGCG GTCACT 37980 CTCTGAAAAC GGAGGCCTCA CAGAGGAAGG GAGCACCAGGCCGCCTGCGC ACAGCC 38040 GCAACTGTGT CTTCTCCACC GCCCCCGCCC CCACCTCCAAGTTCCTCCCT CCCTTG 38100 CTAGGAAATC GCCACTTTGA CGACCGGGTC TGATTGACCTTTGATCAGGC AAAAAC 38160 AAACAGATAA ATAAATAAAA TAACACAAAA GTAACTAACTAAATAAAATA AGTCAA 38220 ACCCATTACA ATACAATAAG ATACGATACG ATAGGATGCGATAGGATACG ATAGGA 38280 ATACAATAGG ATACGATACA 
ATACAATACA ATACAATACAATACAATACA ATACAA 38340 ATACAATACA ATACAATACG CCGGGCGCGG TGGCTCATGCCTGTCATCCC GTCACT 38400 GATGCCGAGG TGGACGCATC ACCTGAAGTC GGGAGTTGGAGACAAGCCCG ACCAAC 38460 AGAAATCCCG TCTCAATTGA AAATACAAAA CTAGCCGGGCGCGGTGGCAC ATGCCT 38520 TCCCAGCTGC TAGGAAGGCT GAGGCAGGAG AATCGCTTGAACCTGGGAAG CGGAGG 38580 AGTGAGCCGA GATTGCGCCA TCGCACTCCA GTCTGAGCAACAAGAGCGAA ACTCCG 38640 AAAAATAAAT ACATAAATAA ATACATACAT ACATACATACATACATACAT ACATAC 38700 ATAAATTAAA ATAAATAAAT AAAATAAAAT AAATAAATGGGCCCTGCGCG GTGGCT 38760 CCTGTCATCC CCTCACTTTG GGAGGCCAAG GCCGGTGGATCAAGAGGCGG TCAGAC 38820 AGGGCCAGTA TGGTGAAACC CCGTCTCTAC TCACAATACACAACATTAGC CGGGCG 38880 GCTGTGCTGT ACTGTCTGTA ATCCCAGCTA CTCGGGAGGCCGAGCTGAGG CAGGAG 38940 GCTTGAACCT GGGAGGCGGA GGTTGCAGTG AGCCGAGATCGCGCCACTGC AACCCA 39000 GGGCGACAGA GCGAGACTCC GTCTCCAAAA AATGAAAATGAAAATGAAAC GCAACA 39060 AATTAAAAAG TGAGTTTCTG GGGAAAAAGA AGAAAAGAAAAAAGAAAAAA ACAACA 39120 AGAACAACCC CACCGTGACA TACACGTACG CTTCTCGCCTTTCGAGGCCT CAAACA 39180 AGGAATTATG CGTGATTTCT TTTTTTAACT TCATTTTATGTTATTATCAT GATTGA 39240 TCGAGACGGA GTCTCGGAGG CCCGCCCTCC CTGGTTGCCCAGACAACCCC GGGAGA 39300 CCCTGGCTGG GCCCGATTGT TCTTCTCCTT GGTCAGGGGTTTCCTTGTCT TTCTTC 39360 CTTTAACCCG CGTGGACTCT TCCGCCTCGG GTTTGACAGATGGCAGCTCC ACTTTA 39420 TTGTTGTTGT TGGGGACTTT CCTGATTCTC CCCAGATGTAGTGAAAGCAG GTAGAT 39480 TTGCCTGGCC TTGCCTGGCC TTGCCTTTTC TTTCTTTCTTTCTTTCTTTA TTACTT 39540 TTTTTCTTCT TCTTCTTCTT CTTTTTTTTG AGACAGAGTTTCACTCTTGT TGCCCA 39600 AGAGGGCAAT GGCGCGATCT CGGCTCACCG CACCCTCCGCCTCCCAGGTT CAAGCG 39660 TCCTGCCTCA GCCTCCTGAT TAGCTGGGAT TACAGGCATGGGCCACCGTG CTGGCT 39720 TTTGTACTTT TAGTAGAGAC GGTGTTTTTC CATGTTGGTCAGGCTGGTCT CCCACT 39780 ACCTCAGGTG GTCCGCCTGC CTTAGCCTCC CAAAGTGCTGGGATGACAGG CGTGCA 39840 CGCCCAGCCT CTCTCTCTCT CTCTCTCTCT CTCGCTCGCTTGCTTGCTTG CTTTCG 39900 TTCTTGCTTT CCCGTTTTCT TGCTTTCTTT CTTTCTTTCGTTTCTTTCAT GCTTGC 39960 TTGCTTGCTT GCTTGCTTTC GTGCTTTCTT GCTTTCCTGTTTTCTTTCTT TCTTTC 40020 TTTCTTTCTT TTGTTTCTTT CTTGCTTGCT TTCTTGCTTGCTTGCTTGCT TTCGTG 40080 CTTGCTTTCC 
TGTTTTCTTT CTTTCTTTCT TTCTTTTCTTTCTTTCTTGC TTGCTT 40140 GCTTGCTTGC TTTCGTGCTT TCTTGTTTTC TCGATTTCTTTCTTTCTTTT GTTTCT 40200 TGCTTGCTTT CTTGCTTGCT TGCTTTCGTG CTTCTTGCTTTCCTGTTTTC TTTCTT 40260 TCTTTCTTTT GTTTCTTTCT TGCTTGCTTT CTTGCTTGCTTGCTTTCGTG CTGTCT 40320 TCTCGATTTC TTTCTTTCTT TTGTTTCTTT CCTGCTTGCTTTCTTGCTTG ATTGCT 40380 TGCTTTCTTG CTTTCTTGTT TTCTTTCTTT CTTTTGTTTCTTTCTTTCTT GCTTCC 40440 TTTCTTGCTT TCTTGCTTGC TTGCTTTCGT GCTTTCTTGTTTTCTTGCTT TCTTTC 40500 GTTTCTTTCT TGCTTGCTTT CTTGCTTCCT TGTTTTCTTGCTTTCTTGCT TGCTTG 40560 CGTGCTTTCT TTCTTGCTTT CTTTTCTTTC TTTCTTTTCTTTTTCTTTCT TTCTTG 40620 CTTTTCTTTC ATCATCATCT TTCTTTCTTT CCTTTCTTTCTTTCTTTCTT TCTATC 40680 TTTCTTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTCTGTTTCGTCCTTTT GAGACA 40740 TTCACTCTTG TTTCCACGGC TAGAGTGCAA TGGCGCGATCTTGGCTCACC GCACCT 40800 CCTCCCGGGT TCGAGCGCTT CTCCTGCCTC CAGCCTCCCGATTAGCGGGG ATTGAC 40860 AGGCACCCCC ACGCCTGGCT TGGCTGATGT TTGTGTTTTTAGTAGGCACG CCGTGT 40920 CCATGTTGCT CAGGCTGGTC TCCAACTCCC GACCTCCTGTGATGCGCCCA CCTCGG 40980 TCGAAGTGCT GGGATGACGG GCGTGACGAC CGTGCCCGGCCTGTTGACTC ATTTCG 41040 TTTATTTCTT TCGTTTCCAC GCGTTTACTT ATATGTATTAATGTAAACGT TTCTGT 41100 TTATATGCAA ACAACGACAA CGTGTATCTC TGCATTGAATACTCTTGCGT ATGGTA 41160 CGTATCGGTT GTATGGAAAT AGACTTCTGT ATGATAGATGTAGGTGTCTG TGTTAT 41220 ATAAATACAC ATCGCTCTAT AAAGAAGGGA TCGTCGATAAAGACGTTTAT TTTACG 41280 AAAAGCGTCG TATTTATGTG TGTAAATGAA CCGAGCGTACGTAGTTATCT CTGTTT 41340 TCTTCCTCTC CTTCGTGTTT TTCTTCCTTC CTTTCTTCCTTTCTCTCCTT CTTTAG 41400 TTCTTCCTCT CTTCCTTTCC TTCTTTCTCT CTTTCTGTCCTTTTTTCCTT CGTGCT 41460 TTCTCTTTCG TTCCCTGTGT TTCCTTCTTT TTTCTTTCCTCTCTGTTTCT TTTTCC 41520 TTTCCTTCGT TTCTTTCCTC ATTCTTTCTC TCTTTTTCGTTGTTTCTTTC CTTCCC 41580 GTCTTTTAAA AAATTGGAGT GTTTCAGAAG TTTACTTTGTGTATCTACGT TTTCTA 41640 GTCTCTCTTT TCTCCATTTT CTTCCTCCCT CCCTCCCTCCCTCCCTGCTC CCTTCC 41700 CTCCTTCCCT TTCGCCATCT GTCTCTTTTC CCCACTCCCCTCCCCCCGTC TGTCTC 41760 TGGATTCCGG AAGAGCCTAC CGATTCTGCC TCTCCGTGTGTCTGCAGCGA CCCCGC 41820 GAGTCCTTGT GTGTTCTTTC TCCCTCCCTC CCTCCCTCCCTCCCTCCCTC CCTCCC 41880 
TCCGAGAGGC ATCTCCAGAG ACCGCGCCGT GGGTTGTCTTCTGACTCTGT CGCGGT 41940 GCAGAGACGC GTTTTGGGCA CCGTTTGTGT GGGGTTGGGGCAGAGGGGCT GCGTTT 42000 CCTCGGGAAG AGCTTCTCGA CTCACGGTTT CGCTTTCGCGGTCCACGGGC CGCCCT 42060 GCCGGATCTG TCTCGCTGAC GTCCGCGGCG GTTGTCGGGCTCCATCTGGC GGCCGC 42120 AGATCGTGCT CTCGGCTTCC GGAGCTGCGG TGGCAGCTGCCGAGGGAGGG GACCGT 42180 GCTGTGAGCT AGGCAGAGCT CCGGAAAGCC CGCGGTCGTCAGCCCGGCTG GCCCGG 42240 GCCAGAGCTG TGGCCGGTCG CTTGTGAGTC ACAGCTCTGGCGTGCAGGTT TATGTG 42300 AGAGGCTGTC GCTGCGCTTC TGGGCCCGCG GCGGGCGTGGGGCTGCCCGG GCCGGT 42360 CAGCGCGCCG TAGCTCCCGA GGCCCGAGCC GCGACCCGGCGGACCCGCCG CGCGTG 42420 AGGCTGGGGA CGCCCTTCCC GGCCCGGTCG CGGTCCGCTCATCCTGGCCG TCTGAG 42480 CGGCCGAATT CGTTTCCGAG ATCCCCGTGG GGAGCCGGGGACCGTCCCGC CCCCGT 42540 CGGGTGCCGG GGAGCGGTCC CCGGGCCGGG CCGCGGTCCCTCTGCCGCGA TCCTTT 42600 CGAGTCCCCG TGGCCAGTCG GAGAGCGCTC CCTGAGCCGGTGCGGCCCGA GAGGTC 42660 TGGCCGGCCT TCGGTCCCTC GTGTGTCCCG GTCGTAGGAGGGGCCGGCCG AAAATG 42720 CGGCTCCCGC TCTGGAGACA CGGGCCGGCC CCTGCGTGTGGCCAGGGCGG CCGGGA 42780 TCCCCGGCCC GGCGCTGTCC CCGCGTGTGT CCTTGGGTTGACCAGAGGGA CCCCGG 42840 TCCGTGTGTG GCTGCGATGG TGGCGTTTTT GGGGACAGGTGTCCGTGTCC GTGTCG 42900 TCGCCTGGGC CGGCGGCGTG GTCGGTGACG CGACCTCCCGGCCCCGGGGG AGGTAT 42960 TTCGCTCCGA GTCGGCAATT TTGGGCCGCC GGGTTATAT 42999(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 175 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: CTCCCGCGCG GCCCCCGTGTTCGCCGTTCC CGTGGCGCGG ACAATGCGGT TGTGCGTCC 60 CGTGTGCGTG TCCGTGCAGTGCCGTTGTGG AGTGCCTCGC TCTCCTCCTC CTCCCCGG 120 GCGTTCCCAC GGTTGGGGACCACCGGTGAC CTCGCCCTCT TCGGGCCTGG ATCCG 175 (2) INFORMATION FOR SEQ IDNO: 19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 755 base pairs (B)TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii)MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) 
ANTI-SENSE: NO(v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCEDESCRIPTION: SEQ ID NO: 19: GGTCTGGTGG GAATTGTTGA CCTCGCTCTC GGGTGCGGCCTTTGGGGAAC GGCGGGGTC 60 GTCGTGCCCG GCGCCGGACG TGTGTCGGGG CCCACTTCCCGCTCGAGGGT GGCGGTGG 120 GCGGCGTTGG TAGTCTCCCG TGTTGCGTCT TCCCGGGCTCTTGGGGGGGG TGCCGTCG 180 TTCGGGGCCG GCGTTGCTTG GCTTACGCAG GCTTGGTTTGGGACTGCCTC AGGAGTCG 240 GGCGGTGTGA TTCCCGCCGG TTTTGCCTCG CGTCTGCCTGCTTTGCCTCG GGTTTGCT 300 GTTCGTGTCT CGGGAGCGGT GGTTTTTTTT TTTTTCGGGTCCCGGGGAGA GGGGTTTT 360 CGGGGGACGT TCCCGTCGCC CCCTGCCGCC GGTGGGTTTTCGTTTCGGGC TGTGTTCG 420 TCCCCTTCCC CGTTTCGCCG TCGGTTCTCC CCGGTCGGTCGGCCCTCTCC CCGGTCGG 480 GCCCGGCCGT GCTGCCGGAC CCCCCCTTCT GGGGGGGATGCCCGGGCACG CACGCGTC 540 GGCGGCCACT GTGGTCCGGG AGCTGCTCGG CAGGCGGGTGAGCCAGTTGG AGGGGCGT 600 TGCCCCCGCG GGCTCCCGTG GCCGACGCGG CGTGTTCTTTGGGGGGGCCT GTGCGTGC 660 GAAGGCTGCG CACGTTGTCG GTCCTTGCGA GGGAAAGAGGCTTTTTTTTT TTAGGGGG 720 GTCCTTCGTC GTCCCGTCGG CGGTGGATCC GGCCT 755 (2)INFORMATION FOR SEQ ID NO: 20: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH:463 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D)TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO(iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE:(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: GGCCGAGGTG CGTCTGCGGGTTGGGGCTCG TCCGGCCCCG TCGTCCTCCG GGAAGGCGT 60 TAGCGGGTAC CGTCGCCGCGCCGAGGTGGG CGCACGTCGG TGAGATAACC CCGAGCGT 120 TTCTGGTTGT TGGCGGCGGGGGCTCCGGTC GATGTCTTCC CCTCCCCCTC TCCCCGAG 180 CAGGTCAGCC TCCGCCTGTGGGCTTCGTCG GCCGTCTCCC CCCCCCTCAC GTCCCTCG 240 AGCGAGCCCG TCCGTTCGACCTTCCTTCCG CCTTCCCCCC ATCTTTCCGC GCTCCGTT 300 CCCCGGGGTT TTCACGGCGCCCCCCACGCT CCTCCGCCTC TCCGCCCGTG GTTTGGAC 360 CTGGTTCCGG TCTCCCCGCCAAACCCCGGT TGGGTTGGTC TCCGGCCCCG GCTTGCTC 420 CGGGTCTCCC AACCCCCGGCCGGAAGGGTT CGGGGGTTCC GGG 463 (2) INFORMATION FOR SEQ ID NO: 21: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 378 base pairs (B) TYPE: nucleicacid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE:Genomic DNA (iii) 
HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENTTYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ IDNO: 21: GGATTCTTCA GGATTGAAAC CCAAACCGGT TCAGTTTCCT TTCCGGCTCC GGCCGGGGG60 GGCGGCCCCG GGCGGTTTGG TGAGTTAGAT AACCTCGGGC CGATCGCACG CCCCCCGT 120CGGCGACGAC CCATTCGAAC GTCTGCCCTA TCAACTTTCG ATGGTAGTCG ATGTGCCT 180CATGGTGACC ACGGGTGACG GGGAATCAGG GTTCGATTCC GGAGAGGGAG CCTGAGAA 240GGCTACCACA TCCAAGGAAG GCAGCAGGCG CGCAAATTAC CCACTCCCGA CCCGGGGA 300TAGTGACGAA AAATAACAAT ACAGGACTCT TTCGAGGCCC TGTAATTGGA ATGAGTCC 360TTTAAATCCT TTAAGCAG 378 (2) INFORMATION FOR SEQ ID NO: 22: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 378 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: GATCCATTGG AGGGCAAGTC TGGTGCCAGC AGCCGCGGTA ATTCCAGCTC CAATAGCGT 60TATTAAAGTT GCTGCAGTTA AAAAGCTCGT AGTTGGATCT TGGGAGCGGG CGGGCGGT 120GCCGCGAGGC GAGTCACCGC CCGTCCCCGC CCCTTGCCTC TCGGCGCCCC CTCGATGC 180TTAGCTGAGT TGTCCCGCGG GGCCCGAAGC GTTTACTTTG AAAAAATTAG AGTTGTTT 240AAGCAGGCCC GAGCCGCCTG GATACCGCCA GCTAGGAAAT AATGGAATAG GACCGCGG 300CCTATTTTGT TTGGTTTTCG GAACTGAGCC CATGATTAAG GGAAACGGCC GGGGGCAT 360CCTTATTGCG CCCCCCTA 378 (2) INFORMATION FOR SEQ ID NO: 23: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 719 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: GGATCTTTCC CGCTCCCCGT TCCTCCCGGC CCCTCCACCC GCGCGTCTCC CCCCTTCTT 60TCCCCTCTCC GGAGGGGGGG GAGGTGGGGG CGCGTGGGCG GGGTCGGGGG TGGGGTCG 120GGGGGACCGC CCCCGGCCGG CAAAAGGCCG CCGCCGGGCG CACTTCAACC GTAGCGGT 180GCCGCGACCG GCTACGAGAC GGCTGGGAAG GCCCGACGGG GAATGTGGCT CGGGGGGG 240GGCGCGTCTC AGGGCGCGCC GAACCACCTC ACCCCGAGTG TTACAGCCCT CCGGCCGC 300TTTCGCGGAA TCCCGGGGCC GAGGGGAAGC 
CCGATACCCG TCGCCGCGCT TTTCCCCT 360CCCCGTCCGC CTCCCGGGCG GGCGTGGGGG TGGGGGCCGG GCCGCCCCTC CCACGCCC 420GGTTTCTCTC TCTCCCGGTC TCGGCCGGTT TGGGGGGGGG AGCCCGGTTG GGGGCGGG 480GGACTGTCCT CAGTGCGCCC CGGGCGTCGT CGCGCCGTCG GGCCCGGGGG GTTCTCTC 540TCACGCCGCC CCCGACGAAG CCGAGCGCAC GGGGTCGGCG GCGATGTCGG CTACCCAC 600GACCCGTCTT GAAACACGGA CCAAGGAGTC TAACGCGTGC GCGAGTCAGG GGCTCGCA 660AAAGCCGCCG TGGCGCAATG AAGGTGAAGG GCCCCGTCCG GGGGCCCGAG GTGGGATC 719 (2)INFORMATION FOR SEQ ID NO: 24: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH:685 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D)TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO(iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE:(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: CGAGGCCTCT CCAGTCCGCCGAGGGCGCAC CACCGGCCCG TCTCGCCCGC CGCGTCGGG 60 AGGTGGAGCA CGAGCGTACGCGTTAGGACC CGAAAGATGG TGAACTATGC CTGGGCAG 120 CGAAGCCAGA GGAAACTCTGGTGGAGGTCC GTAGCGGTCC TGACGTGCAA ATCGGTCG 180 CGACCTGGGT ATAGGGGCGAAAGACTAATC GAACCATCTA GTAGCTGGTT CCCTCCGA 240 TTTCCCTCAG GATAGCTGGCGCTCTCGCAA CCTTCGGAAG CAGTTTTATC CGGGTAAA 300 CGGAATGGAT TAGGAGGTCTTGGGGCCGGA AACGATCTCA AACTATTTCT CAAACTTT 360 ATGGGTAAGG AAGCCCGGCTCGCTGGCGTG GAGCCGGGCG TGGAATGCGA GTGCCTAG 420 GGCCACTTTT GGTAAGCAGAACTGGCGCTG CGGGATGAAC CGAACGCCGG GTTAAGGC 480 CCGATGCCGA CGCTCATCAGACCCCAGAAA AGGTGTTGGT TGATATAGAC AGCAGGAC 540 TGGCCATGGA AGTCGGAATCCGCTAAGGAG TGTGTAACAA CTCACCTGCC GAATCAAC 600 GCCCTGAAAA TGGATGGCGCTGGAGCGTCG GGCCCATACC CGGCCGTCGC CGGCAGTC 660 AACGGGACGG GACGGGAGCGGCCGC 685 (2) INFORMATION FOR SEQ ID NO: 25: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: GAGGAATTCC CCTATCCCTA ATCCAGATTG GTG 33 (2) INFORMATION FOR SEQ IDNO: 26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 35 base pairs (B)TYPE: nucleic 
acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: AAACTGCAGG CCGAGCCACC TCTCTTCTGT GTTTG 35 (2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: AGGAATTCAC AGAAGAGAGG TGGCTCGGCC TGC 33 (2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: AGCCTGCAGG AAGTCATACC TGGGGAGGTG GCCC 34 (2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 80 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: AAACTGCAGG TTAATTAACC CTAACCCTAA CCCTAACCCT AACCCTAACC CTAACCCTA 60 CCCTAACCCT AACCCGGGAT 80 (2) INFORMATION FOR SEQ ID NO: 30: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: TTGGGCCCTA GGCTTAAGG 19 (2) INFORMATION FOR SEQ ID NO: 31: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: 
<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: GCCAGGGTTT TCCCAGTCAC GACGT 25 (2) INFORMATION FOR SEQ ID NO: 32: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: GCTGCAAGGC GATTAAGTTG GGTAAC 26 (2) INFORMATION FOR SEQ ID NO: 33: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: TATGTTGTGT GGAATTGTGA GCGGAT 26 (2) INFORMATION FOR SEQ ID NO: 34: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: GGGTTTAAAC AGATCTCTGC A 21

What is claimed:
 1. A method for amplifying nucleic acid, comprising: introducing a nucleic acid molecule into a plant cell, wherein the nucleic acid molecule includes a sequence of nucleotides that targets it to an amplifiable region of a chromosome in the cell; growing the cell; and identifying from among the resulting cells those that include a chromosome with a portion that has undergone amplification.
 2. The method of claim 1, wherein the targeting sequence of nucleotide(s) is selected from among those that target the molecule to the pericentric heterochromatic region of a chromosome.
 3. The method of claim 1, wherein the targeting sequence comprises rDNA.
 4. The method of claim 1, wherein the targeting sequence comprises an origin of replication or an amplification promoting sequence (APS).
 5. The method of claim 1, wherein the plant is tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato, tomato or arabidopsis.
 6. A method for amplifying a nucleic acid, comprising: introducing a nucleic acid fragment comprising sequences of nucleotides targeted to an amplifiable region of a chromosome into a plant cell under conditions whereby the fragment integrates into the chromosome.
 7. The method of claim 6, further comprising replicating the cell.
 8. The method of claim 6, wherein the targeting sequences of nucleotides are selected from among those that target the molecule to the pericentric heterochromatic region of a chromosome.
 9. The method of claim 6, wherein the targeting sequences comprise rDNA.
 10. The method of claim 6, wherein the targeting sequences comprise an origin of replication or an amplification promoting sequence (APS).
 11. The method of claim 6, wherein the plant is tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato, tomato or arabidopsis.
 12. A method for amplifying a nucleic acid, comprising: introducing a nucleic acid fragment that comprises rDNA into a plant cell under conditions that produce cells that have incorporated the DNA fragment or a portion thereof that comprises the rDNA into a chromosome of the cell.
 13. The method of claim 12, wherein the plant is tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato, tomato or arabidopsis.
 14. A method for amplifying a nucleic acid, comprising: introducing a nucleic acid fragment that comprises an origin of replication or an amplification promoting sequence into a plant cell under conditions to produce cells that have incorporated the DNA fragment or a portion thereof that comprises the origin of replication or an amplification promoting sequence into a chromosome of the cell.
 15. The method of claim 14, wherein the plant is tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato, tomato or arabidopsis.
 16. A nucleic acid molecule, comprising: nucleic acid encoding a gene product or gene products; a selectable marker; and sequences of nucleotides targeted to an amplifiable region of a chromosome in a cell.
 17. The nucleic acid molecule of claim 16, wherein the targeting sequences of nucleotides are selected from among those that target the molecule to the pericentric heterochromatic region of a chromosome.
 18. The nucleic acid molecule of claim 16, wherein the targeting sequences comprise rDNA.
 19. The nucleic acid molecule of claim 16, wherein the targeting sequences comprise an origin of replication or an amplification promoting sequence (APS).
 20. The nucleic acid molecule of claim 16, wherein the gene products encode a biosynthetic pathway.
 21. The nucleic acid molecule of claim 16 that is a plasmid.
 22. The nucleic acid molecule of claim 16, wherein the cell is a plant cell.
 23. The nucleic acid molecule of claim 16, wherein the plant is tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato, tomato or arabidopsis.